PREFACE

This final report presents results of an investigation carried out at
the Environmental Research Institute of Michigan (ERIM) to analyze the
SKYLAB S-192 multispectral scanner data and to assess the utility of
special (unresolved object and signature extension) processing and
information extraction techniques for the remote sensing of Earth resources.

The research covered in this report was performed under Contract 
NAS9-13280 and covers the period between March 1973 and September 1975. 

During this period Mr. L. B. York has been Technical Monitor for NASA. 

Expenses for the preparation of data and some of the processing were shared 
by this contract and ERIM’s subcontract to Michigan State University's 
Contract NAS9-13332 which utilized data collected over the same test site. 

The program was directed by R. R. Legault, Vice-President of ERIM; J. D.
Erickson, Head of the ERIM Information Systems and Analysis Department;
and R. F. Nalepka, Principal Investigator and Head of the ERIM Multispectral
Analysis Section. W. A. Malila was Co-Principal Investigator. The ERIM
number for this report is 101900-61-F.

Part of this investigation was to test information extraction techniques
for SKYLAB S-192 data and to compare those results with results obtained from
processing LANDSAT and aircraft multispectral scanner data. Unfortunately,
the Southeast Michigan test site was cloud covered during every
LANDSAT-1 pass from June to September, so it was impossible to obtain LANDSAT
data over the test site at a time in the growing season that would
be comparable to the S-192 data set being studied. Thus it was
not possible to comparably process LANDSAT data for this investigation.

The authors wish to thank Dr. L. V. Manderscheid of Michigan State 
University, East Lansing for making available the ground information for the 
test site. Special acknowledgement is due to R. B. Crane and J. Gleason of 
the ERIM Multispectral Analysis Section (MAS) staff for their technical 
assistance and suggestions on the data misregistration studies which were carried out.



FORMERLY WILLOW RUN LABORATORIES. THE UNIVERSITY OF MICHIGAN 


The technical work for the study of effects of misregistration (Section 4)
was conducted by R. Cicone and the signature extension work (Section 5)
was carried out by P. Lambeck. Numerous other MAS technical staff
members contributed to the success of this investigation as well.
Throughout this contract period secretarial assistance has been provided
by Ms. D. Dickerson, L. Parker, G. Sotomayor and E. Hugg.
SUMMARY

The objective of this investigation was to examine the utility of
special processing techniques as applied to Skylab S-192 data for the
automatic extraction of resource information. These special processing
techniques include signature extension algorithms to extend the
applicability of signatures over distance, time, and/or measurement
conditions and mixture classifiers to estimate proportions of spatially
unresolved objects. As a part of this investigation, S-192 data gathered
over Southeast Michigan were analyzed and three sites were studied:
1) a 90 square mile agricultural area in Ingham County, 2) an urban and
rural area in the vicinity of Lansing, and 3) an urban and rural area in the
vicinity of Ypsilanti.

Upon receipt of the data we examined the data quality, investigating
the signal-to-noise characteristics and dynamic range of each SDO
(Scientific Data Output). Aircraft scanner data gathered over the agricultural
site the morning of S-192 data collection were also examined and used as a
basis for comparison. The results of the examination of S-192 data quality were
essentially in keeping with the published S-192 performance evaluations [4].
A conclusion reached was that all spectral bands had a very limited range of
values in relation to the noise content of the data; four of the bands were
sufficiently noisy so as to be of doubtful use in classification processing.
Also examined was the spatial registration of the scanner data. The
SDO-to-SDO misregistration in conic data was measured and shown to be
greater than one pixel in some instances. More importantly, further analysis
showed that the effect of scan-line-straightening was to compound and
increase the misregistration of the S-192 data: a maximum misregistration
of 2.2 pixels was calculated. Not only is the misregistration of
scan-line-straightened data not easily correctable, but the additional
misregistration seriously reduces the number of pure pixels available for
training.
Analytical and simulation studies were then performed to investigate the
effects of misregistration on classification accuracy. The results showed that,
for pixels which imaged more than one ground class in one or more channels,
the error rate was substantial and increased as the degree of misregistration
increased. Also shown was that, while the correct classification rate for pure
(one class) pixels did not change significantly as misregistration increased,
the number of such pure pixels markedly decreased as misregistration increased.
Because of the increased, uncorrectable misregistration in scan-line-straightened
data, the recognition processing for this contract was carried out with conic
data. Using the conic data, we were able to substantially correct for
misregistration by selecting a set of 13 SDOs (one for each band) and shifting
some relative to others such that the maximum misregistration was one third of
a pixel.

In preparation for recognition processing of the agricultural test site
using conventional techniques, a set of training statistics was extracted
using a supervised clustering method applied to pure (one class) pixels from
half the area. In this manner, several recognition signatures were defined
for each class, the number depending on actual physiological and physical
phenomena as well as on economic designations. Having established the signatures,
the utility of the 13 spectral bands for recognition processing in the agricultural
area was determined. Using a computer algorithm which computed the average
pairwise probability of misclassification, the 13 bands were rank ordered with
the result that the four bands previously identified as having poor signal
quality were adjudged to be among the worst bands. The two best bands, by far, were
1.55-1.73 μm (SDO 12) and 0.93-1.05 μm (SDO 19). The result of classifying
the agricultural site using conventional techniques and the 7 best bands
provided an overall correct classification rate of 75% for the pure (one-class)
pixels for the local (training) area and 63% for the nonlocal (test) area. A
second measure of performance, the overall estimation of class proportions,
was based on aggregated classification counts of all pixels in the area.
These results, which are given as the root mean square error of the estimates
summed over all classes, were E = 4.7% for the local area and E = 6.9% for the
nonlocal area.
In both cases, great confusion was noted in a triad of corn, trees and
brush. The classification of the data was affected by a combination of the 
limited signal range in the data and the apparent spectral similarity of many 
of the ground classes. The latter effect was attributed to the contrast 
reducing effect of atmospheric haze and the fact that, at the time of year the 
data was collected, there was a large range of conditions for several classes 
(e.g., some of the corn had tasseled and some had not) leading therefore to 
added spectral similarity among classes. The errors in the proportion 
estimation were also affected by the large number of mixture (more than one 
class) pixels in the scene. A brief study indicated that more than 70% of the 
scene was composed of such mixture pixels. In general a disproportionate 
number of such pixels were classified as corn, resulting in a substantial 
overestimation of corn in the scene. 

The utility of signature extension techniques for S-192 data was tested
using the Lansing and Ypsilanti sites for training and test, respectively.
Signature extension techniques are potentially useful for reducing costs and
data processing time for large area surveys and are an important part of 
multispectral data processing. Several signature extension techniques developed 
at ERIM for use on LANDSAT and/or aircraft data were utilized to process data 
for the signature extension test site located some 70 miles from the signature 
extension training area. The test area was chosen particularly because a layer 
of haze covering this site was very evident in the S-190B imagery; thus, this 
was a test under very different atmospheric conditions as well as a test over 
distance. Training statistics were gathered using an unsupervised clustering 
technique and clusters were identified for urban, residential, vegetation, 
water, concrete, bare soil and sparse vegetation. A classification attempt without 
the use of signature extension techniques resulted in poor accuracy while the 
use of signature extension techniques improved classification accuracy. The 
best results were obtained using the dark object algorithm. In a qualitative 
sense these results matched those obtained using local clusters (i.e., clusters 
generated at the signature extension site). 
Further classification was carried out on both training sites previously
mentioned using the other special classifier. This classifier, the unresolved
object or mixtures classifier, first identifies each pixel as being either pure
or a mixture of several classes and, if it is a mixture, estimates the proportions
of pure ground covers in that resolution element. Such a classifier would seem
to be well suited to a data set where more than 70% of the pixels were mixture
pixels. The results of using this approach on both sites were unsatisfactory,
due apparently to the previously mentioned limited signal range, contrast
and spectral discriminability of the data. Thus, no general conclusions were
drawn with regard to the utility of the mixtures classifier on S-192 data.

Results of this investigation indicate that deficiencies in the S-192
data will tend to limit its ultimate utility. To minimize deleterious effects
of channel-to-channel misregistration in any future use of S-192 data, use of
conic format data is recommended. Furthermore, the design of future multispectral
scanner and data processing systems should take into account the experience
gained in processing and analyzing S-192 data. To this end, two recommendations
are made. First, finer spatial resolution should be considered for future
sensors; this would alleviate the problems caused by having a large proportion
of mixture pixels in the scene and the attendant problem of having so few
pure pixels on which to base training statistics. The second recommendation
is that future systems provide a means to adjust scanner gain and offset
parameters to better match the radiance characteristics of individual scenes
and thus make fuller use of the available scanner dynamic range.
1

INTRODUCTION 

Remote sensing of earth resources using multispectral scanners and 
automatic information extraction techniques has been shown over the past 
several years to be a feasible and viable tool for providing information 
required by resource managers in many disciplines. Early multispectral 
scanners used low-flying aircraft platforms for data collection. In 1972, 
multispectral remote sensing systems became spaceborne with the launching 
of the first LANDSAT (initially called the Earth Resources Technology 
Satellite). As it steadfastly orbited the earth, it was capable of
providing information from four broad spectral bands. Moreover, its orbit
characteristics allowed it to overfly the same site every 18 days, allowing 
for timely collection of data as well as enabling the use of temporal 
information. 

SKYLAB, the first U.S. orbiting manned space station, carried as part 
of its payload a new multispectral scanner, designated as sensor S-192. 

The S-192 is more sensitive spectrally than LANDSAT, having 13 bands across 
the visible, near-infrared and thermal-infrared portions of the spectrum. 

The purpose of this contract was to analyze the data collected with the
S-192 and adapt previously developed information extraction techniques for
such data, especially in regard to problems associated with signature
extension and subresolution element classification. Signature extension
techniques potentially provide the ability to use training data from one scene
to classify data from another scene gathered under different conditions.
Subresolution element classification refers to techniques designed to estimate
the proportion of the constituent ground covers in resolution elements
containing two or more different ground covers.
The data used in this study were acquired on August 5, 1973 at approximately
10:02 EST (15:02 GMT) over an area of Southeast Michigan stretching
between Lansing and Detroit. A five by 18 mile rural area comprising the 
townships of Locke, Leroy and White Oak in Ingham County was designated as 
the agricultural intensive study site for the contract, and detailed ground 
information for this area was collected. The principal ground covers of 
the test site are corn, various pastures, grasses, wheat stubbles and 
weeds, dense woodlots, scrub and brush areas and bare soil. Appendix 
III more fully describes the test site. As for the weather on the morning 
of the overflight, a nonuniform haze layer was covering this test site 
area according to ground observers. 

Subsidiary data other than that collected by the S-192 that were used 
for this study include imagery from the SKYLAB EREP S-190A multiband camera
and the S-190B fine resolution camera, screening film (video presentation) 
of each channel of the S-192, and 9-inch false color infrared photography 
acquired by a high altitude aircraft. Also, at the time of S-192 data acquisition, 
the ERIM C-47 aircraft carrying the ERIM M-7 12-band multispectral scanner 
made repeated passes over the test site, collecting MSS data from several 
altitudes. Color, false color IR, and black and white IR photography were 
also collected by the C-47 during these underflights. LANDSAT data for the area 
for the 1973 growing season was unavailable since the area was cloud covered 
on all passes of that satellite. 

For this study, SKYLAB S-192 data were obtained in two formats:
scan-line-straightened data and unstraightened or conic format data. In both
cases the data were radiometrically corrected and had been processed at
Johnson Space Center to reduce the effects of low and high frequency noise.

The remaining sections of this report discuss many aspects of the 
investigation in some detail. In Section 2 the analysis of the S-192 data 
is discussed. Questions related to signal-to-noise, dynamic range, and
band-to-band registration are addressed and S-192 and M-7 signal characteristics
are compared. Section 3 presents and analyzes classification results
achieved over the Southeast Michigan agricultural test site. The effects 
of channel-to-channel misregistration as determined via simulation are 
discussed in Section 4. The classification results achieved in applying 
signature extension and subresolution element processing techniques are 
described in Sections 5 and 6, respectively. Section 7 provides conclusions 
and recommendations arising out of the investigation. 
2

DATA QUALITY ANALYSIS 


2.1 INTRODUCTION 

This section discusses the SKYLAB S-192 multispectral scanner and 
the quality of the data recorded from it. A thorough understanding of 
the workings of the scanner and the characteristics of the resultant data 
is necessary to successfully process the data and interpret the results. At 
the end of the section, the S-192 data characteristics are compared to those 
of another scanner, the aircraft mounted ERIM M-7 scanner. 

2.1.1 DESCRIPTION OF S-192 

The S-192 is extensively described elsewhere [1] and in Appendix I; 
here we describe it briefly to introduce the concepts necessary for
understanding the material in this report.

The S-192 has a conical scan, with a scan frequency of 94.79 scans/second,
using only the forward 116° of the scan for obtaining earth resources
information. The scanner instantaneous field of view is 0.182 mrad (approximately
81 m on the ground at spacecraft orbital altitude) and successive scan lines
overlap by about 10%. The data are oversampled by 10% along the scan line
as well, so the effective pixel (picture element) size is about 72 m x 72 m.
Data over the Southeast Michigan test site were collected at an altitude of
441,429 meters. While the data are originally collected as conic scan lines,
i.e., along an arc of a circle, the data are processed at NASA/Johnson Space
Center to produce scan-line-straightened data to conform with the majority
of data display forms. This aspect of the data will be extensively discussed
later.
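
As a rough check on the figures quoted above, the ground footprint and the
effective pixel size can be computed from the instantaneous field of view and
the altitude. This is a minimal sketch, assuming the small-angle approximation,
a viewing range equal to the quoted altitude (the slant range of the conical
scan is ignored), and that a 10% overlap simply shrinks the sample spacing by
10%. The function names are illustrative and are not part of the original
processing.

```python
IFOV_RAD = 0.182e-3      # instantaneous field of view quoted in the text (rad)
ALTITUDE_M = 441_429.0   # altitude over the Southeast Michigan test site (m)
OVERLAP = 0.10           # approximate overlap / oversampling fraction

def ground_footprint(ifov_rad, range_m):
    """Footprint width in metres under the small-angle approximation."""
    return ifov_rad * range_m

def effective_pixel(footprint_m, overlap):
    """Sample spacing when adjacent samples overlap by the given fraction."""
    return footprint_m * (1.0 - overlap)

footprint = ground_footprint(IFOV_RAD, ALTITUDE_M)
pixel = effective_pixel(footprint, OVERLAP)
print(f"footprint ~ {footprint:.1f} m, effective pixel ~ {pixel:.1f} m")
```

Under these assumptions both values land within about a metre of the 81 m and
72 m quoted in the text.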

There are 13 detectors on the S-192, and the wavebands covering the
visible, near infrared, and thermal infrared portions of the spectrum are
listed in Appendix I. The data from each detector were sampled and recorded
in the SDOs (channels) noted in the table. Each pixel, or picture element,
contains 22 SDOs. The data were sampled so that the 13 detectors produced
22 channels as follows: for all odd numbered SDOs, the detector signals
were sampled at the same instant during the scan, time t. For all even
numbered SDOs, the appropriate detectors were simultaneously sampled at
a later instant, (t + Δt), one half pixel along the scan line after time t.
Thus eight of the detectors are sampled twice for each data point, and the
thermal detector odd SDO sample is recorded both in SDO 15 and SDO 21.
Those detectors which are sampled twice for each pixel (e.g., SDOs 1 and 2)
are referred to as high sample rate bands while the other detectors are
referred to as low sample rate bands.
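
The sampling scheme just described can be summarized as a small lookup table.
The sketch below is illustrative only: the detector-to-SDO assignment is
transcribed from Tables 2.1 and 2.2 of this report, while the variable and
function names are assumptions introduced here. It checks that the 13
detectors account for all 22 SDOs and that eight detectors are doubly sampled.

```python
# Detector -> SDO assignment, transcribed from Tables 2.1 and 2.2.
DETECTOR_SDOS = {
    1: (22,), 2: (18,), 3: (1, 2), 4: (3, 4), 5: (5, 6), 6: (7, 8),
    7: (9, 10), 8: (19,), 9: (20,), 10: (17,), 11: (11, 12),
    12: (13, 14), 13: (21, 15, 16),
}

def sample_offset_pixels(sdo):
    """Along-scan offset: odd SDOs are sampled at t, even SDOs half a pixel later."""
    return 0.0 if sdo % 2 == 1 else 0.5

def is_high_sample_rate(sdos):
    """A detector is doubly sampled if it feeds both an odd and an even SDO."""
    return any(s % 2 == 1 for s in sdos) and any(s % 2 == 0 for s in sdos)

all_sdos = sorted(s for sdos in DETECTOR_SDOS.values() for s in sdos)
high_rate = [d for d, sdos in DETECTOR_SDOS.items() if is_high_sample_rate(sdos)]
low_rate = [d for d in DETECTOR_SDOS if d not in high_rate]

print("SDO count:", len(all_sdos))
print("high sample rate detectors:", high_rate)
print("low sample rate detectors:", low_rate)
```

Note that a low sample rate detector may land on either an odd or an even SDO,
so two low-rate bands can still differ by the half-pixel sampling offset.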

The following sections will describe and analyze the data from the 
S-192 in terms of signal characteristics, spatial registration and
resolution, and will discuss their impact on processing of S-192 data.

2.1.2 SIGNAL CHARACTERISTICS

The processing of the S-192 data was begun by analyzing the information 
content of the data channels. While we are interested in using differences 
in reflectance (and/or emittance and temperature) characteristics to
discriminate between the ground covers of interest, the data values recorded
on the tape are only indirectly related to the ground reflectance (or thermal)
characteristics, being acted upon by the atmosphere, the sensor optics and 
electronics, and the digitizing electronics [2]. Here we consider just the 
effect of the system electronics on the radiant energy collected by the 
scanner. In the end, the desired outputs from a system of this sort are
signals which, for different object classes, are distinct enough to allow
classification of the data based upon pattern recognition techniques. The 
components of the system effects which can be analyzed and discussed are 
the sensitivity and linearity of the individual detectors and the detector 
output utilization of the dynamic range of values available. There is also 
the consideration of system noise, especially in relation to the signal levels 
being output by the detectors. Finally, the apparent registration of the system 
should also be inspected. 
The first part of this section describes various measurements made
on the S-192 data. Where it is difficult or impossible to derive absolute
measures for several of these components, measures of the S-192 relative
to those of another multispectral scanner will allow us to obtain a better
feel for the performance of the S-192. The remainder of this section,
therefore, discusses these same measurements as made on the ERIM M-7
aircraft mounted scanner data and the results for the two scanners are compared.

To begin with, screening imagery and digital gray scale printouts 
(graymaps) of each SDO were visually analyzed for noise characteristics. 

It was seen that while most channels appeared to be of good quality, three 
detectors (SDOs 5,6; 7,8; 18) contained a high degree of scan-line dependent 
noise and two detectors (SDOs 15,16,21; 22) were so noisy that there was no 
visible structure in the data. By scan line dependent noise we mean that 
striations along the arc of the conic scan were quite prominent. Figure 2.1 
displays a piece of screening film for one SDO from each group, and Figure 2.2 
indicates the portions of the spectrum covered by the three groups of SDOs. 

As noted, the two detectors in the last group are the thermal band,
10. - 12.68 μm, and the .41 - .45 μm band, respectively. In the case of the
thermal band, it has been reported [3] that the noise equivalent temperature
for the thermal detector for this data set is 2.6°K. It is entirely possible
that this noise level exceeds the temperature changes occurring in the scene
so that there is essentially no information in this band. It was further
noted that one of the noisier detectors is the .66 - .73 μm band, which covers
the region of chlorophyll absorption. This is unfortunate since this band is
usually a key band in the processing of multispectral data for agricultural
areas. It was also noted from viewing these graymaps that most roads and
other features useful for location of fields were not readily evident.
FIGURE 2.1. SCREENING FILMS FOR S-192 DATA. Examples of clear data, some scan line noise, and very noisy data.
Following this, a more quantitative analysis of S-192 signal quality
was carried out. The first measure, the dynamic range of the data, was
obtained by examining histograms of pixels from an area 600 lines long by
700 points wide. The area sampled included urban, water, forested and rural
areas and included the agricultural test site. The results, as listed in
Table 2.1, were tabulated in two ways. In examining the histograms it
was not clear at which point on the tails one no longer had data but
rather was just viewing infrequent noise spikes. Accordingly two rules
were used for determining the range: for the first, the limit was taken
as occurring at the first empty bin of the histogram; for the second, the
data values between the 10th and 90th percentile were used. In the latter
case, we are looking at the range of values for 80%, or most, of the data.
Here the dynamic range was between 6% and 12% of the available range of
256 counts; no SDO had more than 5 bits of significance.
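
The two range rules can be sketched in a few lines. This is a minimal
illustration, not the original ERIM software: it assumes that the "first empty
bin" search starts at the most populated bin and walks outward in both
directions, and that percentiles are taken by the nearest-rank method. Both
choices, and all function names, are assumptions made here.

```python
import math
from collections import Counter

def continuous_range(values, levels=256):
    """Walk outward from the histogram mode until the first empty bin on each side."""
    hist = Counter(values)
    mode = max(hist, key=hist.get)
    lo = mode
    while lo - 1 >= 0 and hist.get(lo - 1, 0) > 0:
        lo -= 1
    hi = mode
    while hi + 1 < levels and hist.get(hi + 1, 0) > 0:
        hi += 1
    return lo, hi

def percentile_range(values, lower=10, upper=90):
    """Data values at the lower and upper percentiles (nearest-rank method)."""
    ordered = sorted(values)
    n = len(ordered)

    def nearest_rank(p):
        rank = max(1, math.ceil(p / 100 * n))
        return ordered[rank - 1]

    return nearest_rank(lower), nearest_rank(upper)

def bits_of_significance(lo, hi):
    """Whole bits needed to encode every count level from lo to hi inclusive."""
    return math.ceil(math.log2(hi - lo + 1))

# Synthetic counts: a flat block of data from 90 to 110 plus one isolated noise spike.
counts = [v for v in range(90, 111) for _ in range(5)] + [200]
print(continuous_range(counts))   # the spike at 200 is excluded
print(percentile_range(counts))
print(bits_of_significance(64, 94))
```

With the widest 10% to 90% range in Table 2.1 (30 counts), the last call gives
5 bits, in line with the statement that no SDO had more than 5 bits of
significance.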

To obtain a fuller picture of the situation, these results need to 
be compared to the noise content in the data, as well as the separability 
of signals representing different ground classes. 

Measuring the noise characteristics of the scanner, i.e., noise from
electronic sources not including scene dependent sources, requires analyzing
data from a uniform reflector. The closest thing to a uniform reflector
that the data set included was a large lake. We developed statistics, means
and standard deviations in each SDO, from the pixels of the lake. We use
them here with the following strictures. Because of weeds and other
suspended vegetation in the water, patches of shallow water and some atmospheric
back-scattering at the blue end of the visible spectrum, the estimates of the
noise given by the standard deviation will be greater than the true condition.
At longer wavelengths these effects are diminished and the accuracy of the
estimate improves. In Table 2.2 we present the mean and standard deviation
measured; the signal:noise calculated is the ratio of these two quantities.
One further measure, range:noise, is the ratio of the dynamic range to the



TABLE 2.1. S-192 DATA QUALITY ANALYSIS:
DYNAMIC RANGE IN COUNTS

In each column, the first entry indicates the data values, the
second is the number of counts.

DETECTOR   SDOs(1)     DYNAMIC RANGE(2)     DYNAMIC RANGE(3)
    1      22           76-126,   50          89-105,  16
    2      18           90-140,   50 (4)      95-117,  22
    3      1,2          48-98,    50          57-70,   13
    4      3,4          18-71,    53          29-42,   13
    5      5,6          14-81,    67          29-48,   19
    6      7,8          37-110,   73          67-86,   19
    7      9,10         21-126,  105          64-94,   30
    8      19           41-125,   84          75-105,  30
    9      20           22-123,  101          74-97,   23
   10      17           16-118,  102          73-96,   23
   11      11,12        13-99,    86          44-63,   19
   12      13,14         4-95,    91          20-42,   22
   13      21,15,16    126-177,   51         140-156,  16

Maximum Range Available: 0-255

(1) For the doubly sampled detectors, results were calculated
    for both SDOs and found to be in agreement, as would be
    expected. Hence they are reported together.
(2) Used continuous rule
(3) Used 10% to 90% rule
(4) Trimodal distribution; reported is the major distribution
TABLE 2.2. S-192 DATA QUALITY ANALYSIS:
SIGNAL:NOISE

DETECTOR   SDOs(1)     SIGNAL MEAN   STANDARD DEVIATION   SIGNAL:NOISE   RANGE:NOISE
    1      22              95.8             5.6               17.1           2.9
    2      18             102.4            11.7                8.8           1.9
    3      1,2             56.5             2.8               20.1           4.6
    4      3,4             29.3             2.8               10.5           5.4
    5      5,6             32.4             5.2                6.2           3.7
    6      7,8             42.1             4.8                8.8           4.0
    7      9,10            26.4             8.3                3.2           3.6
    8      19              23.5             5.0                4.7           6.0
    9      20              26.1             6.4                4.1           3.6
   10      17              18.1             5.3                3.4           4.3
   11      11,12           14.4             3.3                4.4           5.8
   12      13,14           10.3             4.4                2.3           5.0
   13      21,15,16       144.3             4.8               30.1           3.3

(1) For the doubly sampled detectors, results were calculated for both
    SDOs and found to be in agreement, as would be expected. Hence they
    are reported together.
noise and indicates how many noise standard deviations wide the data range is
in each band. For this calculation we used the dynamic range according to the
10% to 90% rule, since we are interested in the majority of the data points.
It is noteworthy that the bands specified as exceedingly noisy at the beginning
of the analysis have the lower range:noise values, although their signal:noise
may be good.
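
The two ratios reported in Table 2.2 follow directly from the lake statistics
and the 10% to 90% dynamic range of Table 2.1. The short sketch below
recomputes them for three detectors using values copied from those tables; the
function names are illustrative and not taken from the original software.

```python
def signal_to_noise(mean, std):
    """Signal:noise as defined in the text: lake mean over lake standard deviation."""
    return mean / std

def range_to_noise(range_counts, std):
    """Range:noise: the 10%-90% dynamic range (in counts) over the noise estimate."""
    return range_counts / std

# (detector, lake mean, lake standard deviation, 10%-90% range in counts)
rows = [
    (1, 95.8, 5.6, 16),
    (8, 23.5, 5.0, 30),
    (13, 144.3, 4.8, 16),
]

for det, mean, std, rng in rows:
    sn = round(signal_to_noise(mean, std), 1)
    rn = round(range_to_noise(rng, std), 1)
    print(f"detector {det}: signal:noise = {sn}, range:noise = {rn}")
```

Detector 13 (the thermal band) shows why the two measures must be read
together: its signal:noise is the highest in the table because its mean level
is large, yet its data span only a few noise standard deviations.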

To complete the analysis of signal quality we would like to get a feel
for the detector sensitivity. However, it will be possible to do this only
in a relative sense. By locating, for different ground classes, the same areas
on the ground in both the S-192 and M-7 data sets, signatures may be calculated
and either distances between the distributions or probabilities of
misclassification (the degree of overlap between pairs of signatures) may be
calculated and used to compare the separability of signals between the two
scanners.

Two areas* were located in the test site and signatures (mean vectors and
covariance matrices) were calculated from the pixels in each field. These two
fields were chosen solely because they were the two largest occurrences of
different classes that appeared in both the S-192 and the M-7 data sets. We
wanted the largest fields possible so as to have a sufficient number of pixels
in the S-192 data set and thus to estimate the signatures for these fields
well. The corn field was very large and as a result 59 field-center pixels
were identified and used for the signature calculation. The woodlot, on the
other hand, was not small, but still only nine field-center pixels could be
identified for it.

To measure the distance between the signatures we have chosen to calculate a
form of the Bhattacharyya distance.** The distance calculated was

[equation illegible in source]

where $\mu_C$ and $\mu_T$ are the mean vectors of corn and trees respectively
and $\Sigma_C$ and $\Sigma_T$ are the covariance matrices for corn and trees.

To enable us to analyze the situation even more closely, we calculated
the distance for each channel,

[equation illegible in source]

and the results are given in Table 2.3 below. Obviously, the larger the
distance calculated, the greater the separation between the two distributions.

*Fields chosen were a large corn field in Section 16, Leroy Township
and a large hardwood woodlot in Section 4, Leroy Township.

**The full form of the Bhattacharyya distance is:

$$B = \frac{1}{8}(\mu_C - \mu_T)^T \left[\frac{\Sigma_C + \Sigma_T}{2}\right]^{-1} (\mu_C - \mu_T) + \frac{1}{2}\ln\frac{\left|\frac{\Sigma_C + \Sigma_T}{2}\right|}{\sqrt{|\Sigma_C|\,|\Sigma_T|}}$$


TABLE 2.3. DISTANCE BY DETECTOR BETWEEN CORN FIELD
AND WOODLOT FOR S-192

DETECTOR   SDOs        DISTANCE
    1      22           0.45
    2      18           0.47
    3      1,2          1.42
    4      3,4          0.06
    5      5,6          0.0003
    6      7,8          3.3
    7      9,10         4.42
    8      19           5.20
    9      20           3.03
   10      17           0.20
   11      11,12        3.3
   12      13,14        0.40
   13      21,15,16     0.05
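
For readers who wish to experiment with a per-channel separability measure of
this kind, the sketch below computes the Bhattacharyya distance between two
univariate Gaussian signatures, i.e., the standard full form reduced to a
single channel. This is an assumption made for illustration: the exact
per-channel expression used for Table 2.3 is not legible in the source, so the
code is not claimed to reproduce the tabulated values. The function name and
the example statistics are hypothetical.

```python
import math

def bhattacharyya_1d(mean_a, var_a, mean_b, var_b):
    """Bhattacharyya distance between two univariate Gaussian distributions.

    First term: squared mean difference scaled by the average variance.
    Second term: penalty for unequal variances (zero when var_a == var_b).
    """
    avg_var = 0.5 * (var_a + var_b)
    mean_term = 0.125 * (mean_a - mean_b) ** 2 / avg_var
    var_term = 0.5 * math.log(avg_var / math.sqrt(var_a * var_b))
    return mean_term + var_term

# Hypothetical single-channel statistics for two classes (not values from the report).
corn_mean, corn_var = 80.0, 9.0
trees_mean, trees_var = 74.0, 16.0
print(round(bhattacharyya_1d(corn_mean, corn_var, trees_mean, trees_var), 3))
```

As with the tabulated distances, a larger value indicates less overlap between
the two class distributions in that channel; identical distributions give zero.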
Readily apparent is the large disparity in the table's values. In
general those bands which had been identified as being noisy have very small
distance values (D), the exception being band 6 (.67 - .73 μm) which is in the
spectral region of chlorophyll absorption. Other bands with small distances
merely indicate very little separability in these bands for these two object
classes. In Section 2.3 these results are compared to those obtained by
analyzing signatures of these same two areas calculated from the M-7 data set.
This provides some measure of how well or how poorly the distributions are
separated in the S-192 data.

In summary, S-192 data has been analyzed in several ways and has been
shown to have limited signal range, especially in relation to the system
noise. By the word system is meant the combined optics and electronics of
the data collection facility and also the data preprocessing facility.
Conclusions on how accurately such data can be classified, however, are
not easily drawn from this information. By comparing S-192 data
characteristics to those of another multispectral scanner we may obtain a better
understanding of the situation. In Section 2.3, such a comparison is made.

2.1.3 SPATIAL REGISTRATION 

Multispectral remote sensing and multivariate analysis have at their
core the concept that many channels of information regarding one data point
(pixel) can be used to more accurately classify it. One necessary condition,
obviously, is that all the channels of information used must refer to the same
point or condition. For example, if most channels of a pixel of multispectral
data image an area of Class 1, while some other channels image an area of
Class 2, it may not be possible to correctly classify the pixel. Thus, all
channels of information must be spatially registered, i.e., all imaging the same
area on the ground, if one is to achieve good results. If the data are seriously
misregistered it may be possible to process the data in such a way as to substantially
correct the problem. We analyzed both the conic data and the scan-line-straightened
data for misregistration. It turned out that the scan-line-straightening procedure
further misregisters the data, so that the conic data are better registered than the
scan-line-straightened data. Details of the two analyses are presented below.
2.1.3.1 Misregistration in Conic Data

By the S-192 system design, all even-numbered SDOs are perfectly registered 
one with the other; the same is true for all odd numbered SDOs. Further, there 
is a one-half pixel misregistration between the odd numbered SDOs and the even 
numbered SDOs due to the sampling technique used. Further misregistration is 
introduced by scanner electronics, by different response times for different detec- 
tors, and/or by improperly skewed record heads on the spacecraft tape recorder. 

These combine to produce the misregistration observed in the conic data. 

Misregistration in conic data, i.e., misregistration caused by scanner and 
related recording electronics, has been documented by Braithwaite and Lambeck [3,4]. 
Their measurements were made using scans of the lunar surface and were 
accurate to a quarter pixel. These results showed that SDOs 17, 19, and 20 lagged 
one half pixel from where they would be expected. To investigate the registration 
properties of the data set being processed, a short manual investigation was 
carried out for the conic data set, utilizing the fact that significant reflectance 
changes occur at land/water interfaces in many of the bands. These results were 
in agreement with those cited above, again with a quarter pixel error in measurement. 

To obtain more precise answers as to what the spatial misregistration 
characteristics of the conic data were, a more analytical technique was developed. The 
technique used is presented in full in Appendix IV; here we summarize it 
briefly so that the discussion of the results will be understandable. 

To determine the misregistration between two channels, the cross correlation 
was determined over a range of fractional pixel shifts. The cross-correlation 
function then has a maximum at the shift representing the actual misregistration. 
Initial tests of the method indicated that the values near the peak closely 
approximated a quadratic curve. To obtain a more accurate estimate of the shift 
at which the peak actually occurs, a quadratic curve was fitted to the three shift 
values nearest the peak. From the coefficients of this curve, the peak of the 
cross-correlation function was estimated. 
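
The procedure summarized above can be sketched in a few lines of Python. 
This is only an illustration under stated assumptions, not the Appendix IV 
algorithm: the function names, the use of linear interpolation to produce 
fractional pixel shifts, the search range of +/-2 pixels and the 0.25 pixel 
step are all hypothetical choices made for the example. 

```python
import numpy as np

def correlation_at_shift(ref, other, shift, margin):
    """Correlation of ref[i] with `other` sampled at i + shift.

    Fractional positions are obtained by linear interpolation (an assumption).
    `margin` samples are skipped at each end to limit boundary effects.
    """
    idx = np.arange(margin, len(ref) - margin)
    shifted = np.interp(idx + shift, np.arange(len(other)), other)
    return np.corrcoef(ref[idx], shifted)[0, 1]

def estimate_misregistration(ref, other, max_shift=2.0, step=0.25):
    """Estimate the fractional pixel shift that best aligns `other` with `ref`."""
    shifts = np.arange(-max_shift, max_shift + step / 2, step)
    margin = int(np.ceil(max_shift)) + 1
    corr = np.array([correlation_at_shift(ref, other, s, margin) for s in shifts])
    k = int(np.argmax(corr))
    if k == 0 or k == len(shifts) - 1:
        return float(shifts[k])  # peak at the edge of the search range: no refinement
    # Vertex of the parabola through the three correlation values nearest the peak.
    y0, y1, y2 = corr[k - 1], corr[k], corr[k + 1]
    curvature = y0 - 2.0 * y1 + y2
    if curvature == 0.0:
        return float(shifts[k])
    return float(shifts[k] + 0.5 * (y0 - y2) / curvature * step)
```

With this sign convention a positive result means that `other` lags `ref` 
by that many pixels along the scan line. 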

Table 2.4 contains the estimated misregistration between 17 of the original 
22 Skylab SDOs. The SDOs (15, 16, 18, 21, 22) which do not appear in the table were 
not used in this investigation, because they were not sufficiently correlated 
with any other channels to obtain meaningful results. 

The misregistration was not actually determined by direct measurement for 
all of the pairs of channels represented in the table. The misregistration 
was first measured between seven pairs of even and odd numbered high sample 
rate SDOs (1-2, 3-4, 5-6, 7-8, 9-10, 11-12, 13-14). In all cases, the average 
measurement taken over 5 lines of data was almost exactly 0.5 pixel. These 
measurements indicated that the misregistration between these pairs of channels could 
safely be assumed to be 1/2 pixel. Measurements were then made, using 10 lines of 
conical data, on an additional seventeen pairs of correlated (correlation 
coefficient greater than 0.5 for a large sample of pixels) channels chosen from 
among the odd numbered high sample rate channels and the remaining low sample 
rate channels. A multiple linear 
regression was performed on these seventeen measurements to obtain estimates 
of the misregistration between nine pairs of channels from which estimates of 
all of the remaining pairs were derived. The sum of the squared deviations 
between the 17 actual measurements and their predicted values from the 
regression analysis was 0.0015. This low figure indicates the consistency of 
the results obtained from the different pairs of channels. As a further test, 
measurements of the misregistration between nine pairs of channels, taken from 
a different set of 10 lines, were also made. The sum of the squared deviations 
between these measured values and the values shown in Table 2.4 was 0.0067. 
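
The report does not give the design of the regression, so the following is 
only one plausible formulation, offered as a sketch. It assumes that every 
channel has a single unknown offset along the scan line, that one channel 
is taken as the zero reference, and that each measured pair shift equals 
the difference of the two offsets. With ten channels this leaves nine 
independent quantities, which is at least consistent with the nine pairs 
mentioned above. The function names and the sample numbers in the usage 
note are hypothetical. 

```python
import numpy as np

def fit_channel_offsets(channels, measurements):
    """Least-squares channel offsets from pairwise shift measurements.

    channels     : list of channel numbers; the first one is the zero reference
    measurements : list of (channel_a, channel_b, shift), where the model
                   (an assumption) is shift = offset[b] - offset[a]
    Returns (offsets dict, sum of squared deviations of the fit).
    """
    reference, unknowns = channels[0], channels[1:]
    column = {ch: k for k, ch in enumerate(unknowns)}
    design = np.zeros((len(measurements), len(unknowns)))
    observed = np.zeros(len(measurements))
    for row, (a, b, shift) in enumerate(measurements):
        if b != reference:
            design[row, column[b]] += 1.0
        if a != reference:
            design[row, column[a]] -= 1.0
        observed[row] = shift
    solution, *_ = np.linalg.lstsq(design, observed, rcond=None)
    offsets = {reference: 0.0}
    offsets.update({ch: float(solution[column[ch]]) for ch in unknowns})
    residual = observed - design @ solution
    return offsets, float(residual @ residual)

def pair_misregistration(offsets, a, b):
    """Predicted shift between any two channels, measured or not."""
    return offsets[b] - offsets[a]
```

For instance, three made-up measurements `(1, 3, 0.1)`, `(3, 17, 0.5)` and 
`(1, 17, 0.6)` are mutually consistent, so the fit reproduces them with a 
sum of squared deviations near zero; inconsistent measurements would show 
up as a larger figure, which is how the 0.0015 quoted above is read here. 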

To determine the misregistration between any pair of channels from 
Table 2.4, find the fractional pixel value in the table corresponding to the 
desired pair of channels. The sign of the entry in the table denotes the 
direction the channel given by the column must be shifted to register it with 
the row channel. Positive is defined as in the direction of scan and negative 
as the opposite direction. For example, channel 1 lags channel 2 and channel 2 
also leads channel 3. 

Results indicate that the algorithm which was developed is, in fact, quite 
accurate. The measurements made on the even and odd numbered high sample rate 
SDOs yielded exactly the results expected. The measurements made on the 17 pairs 
of channels were consistent among themselves. The standard deviation of each 
of these estimates over the 10 lines of data which were employed was also 
quite small (less than 0.05 pixel). Measurements made on the second set of 10 
lines were also consistent with those obtained from the first set of lines. 
These results indicate that the method is reliable. 

Furthermore, it is possible to substantially correct for the misregistration 
in conic data, and to define a set of 10 SDOs, one for each detector called 
out in Section 2.1.3 as being usable, which are fairly well registered. This 
may be done by first shifting SDOs 17 and 19 one pixel in the scan direction 
relative to the other SDOs and then choosing the even numbered high sample rate 
SDOs 2, 4, 6, 8, 10, 12, and 14, and finally SDO 20 along with SDOs 17 and 19. 
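
The correction just described amounts to a one pixel shift of two SDOs 
followed by a channel selection. A minimal sketch follows. The data layout 
(a dictionary mapping SDO number to the samples of one conic scan line), 
the function name, and the convention that a higher sample index lies 
further along the scan direction are assumptions made for illustration 
only. 

```python
import numpy as np

# The ten SDOs named in the text as forming a fairly well registered set.
REGISTERED_SDOS = [2, 4, 6, 8, 10, 12, 14, 17, 19, 20]

def register_conic_line(line, shifted_sdos=(17, 19), shift=1):
    """Return the registered subset of one conic scan line.

    line  : dict mapping SDO number -> 1-D sequence of samples
    shift : whole pixels (>= 1) by which `shifted_sdos` are moved toward
            higher sample index, assumed here to be the scan direction
    Samples with no source value after the shift are set to NaN.
    """
    registered = {}
    for sdo in REGISTERED_SDOS:
        samples = np.asarray(line[sdo], dtype=float)
        if sdo in shifted_sdos:
            moved = np.full_like(samples, np.nan)
            moved[shift:] = samples[:-shift]
            samples = moved
        registered[sdo] = samples
    return registered
```

If the opposite index convention applies to a given data tape, the slice 
assignment would simply be reversed. 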

The next aspect of this discussion is to consider the effect of misregistered 
conical data on the scan-line-straightened data. 

2.1.3.2 Spatial Misregistration for Scan Line Straightened Data 

In the previous section we discussed the existence and extent of spatial 
misregistration in conic data. In this section we examine it for scan- 
line-straightened data and also examine the effects of the scan-line- 
straightening algorithm on spatial misregistration. It is shown that even 
in the absence of scanner-related misregistration, serious misregistration is 
created in the data by the scan-line-straightening algorithm. 

For this analysis, it was not possible to use the cross correlation 
technique from the previous section because the technique requires some 500 
continuous points on each scan line to be used in the algorithm to reduce 
boundary effects and these 500 pixels must have identical misregistration 
characteristics. That this last condition does not occur in the scan-line- 
straightened data will be evident from the discussion below. 

By the S-192 system design, all even-numbered SDOs are perfectly 
registered one with the other; the same is true for all odd numbered SDOs. 
Further, there is a one-half pixel misregistration between the odd numbered 
SDOs and the even numbered SDOs due to the sampling technique used. Further 
misregistration is introduced by scanner electronics, by different response 
times for different detectors, and/or by improperly skewed record heads on the 
spacecraft tape recorder. These combine to produce the misregistration observed 
in the conic data. 

When the scan-line-straightening algorithm rearranges the collected pixels 
into scan-line-straightened format, additional spatial misregistration is 
introduced. The following example gives a graphic account of the randomness 
of the resulting misregistration and the possible extent of it. Presented below, 
in Figure 2.3, are two pixels each from two consecutive conic scan lines and 
the manner in which they are assigned to a straightened scan line. 


[Figure 2.3 not reproduced. It shows two consecutive conic scan lines, with 
the centers of the odd numbered SDOs and of the even numbered SDOs marked. 
A and B are the centers of scan-line-straightened pixel a, odd and even SDOs, 
respectively.] 

FIGURE 2.3. ASSIGNMENT OF SDOs IN SCAN-LINE-STRAIGHTENING 


To begin the analysis, let us break the 22 SDOs into four subsets and examine 
each independently. The four subsets are: 1) ODD numbered LOW sample rate 
SDOs, 2) EVEN numbered LOW sample rate SDOs, 3) ODD numbered HIGH sample rate 
SDOs and 4) EVEN numbered HIGH sample rate SDOs. It is assumed that all SDOs in 
a subset will be assigned in the same way; this is so since the assignment 
algorithm as well as the starting point on a scan line is the same for all SDOs. 




All ODD numbered, LOW sample rate SDOs from pixel J in line N will be 
assigned to A of scan-line-straightened pixel a (A being the center of the 
resolution cell for the ODD SDOs of a). Similarly, all EVEN numbered LOW sample 
rate SDOs from pixel I, scan line 0 will be assigned to B (B being the center 
of the resolution cell for the EVEN numbered SDOs of pixel a). 

When the high sample rate SDOs are straightened, the odd-even pair 
of SDOs for each detector are interleaved, then the samples are assigned to 
straightened lines and points and broken again into an odd-even SDO pair. 
Thus for this example, all EVEN numbered, HIGH sample rate SDOs from pixel I, scan 
line N, will be assigned to A and renamed to be the ODD numbered SDOs of pixel a. 
Similarly all ODD numbered HIGH sample rate SDOs from pixel J, scan line 0, will be 
assigned to B and become the EVEN numbered SDOs for pixel a. 

Within each of the two cases cited above, the low sample 
rate and the high sample rate groups, the misregistration between the even 
SDOs and the odd SDOs in the along scan line direction will be that found in 
the conic data. In the along track direction, for the example in 
Figure 2.3, there will be one full pixel misregistration due just to the 
scan-line-straightening. This is the maximum that could be created by this 
particular effect. 

The misregistration between a set of high sample rate SDOs and a set of 
low sample rate SDOs is indeterminate since it depends on whether or not the 
even-odd designation for the high sample rate SDOs in the straightened format 
has been switched from what it was in the conic format. Potentially, the along 
scan misregistration between low and high sample rate SDOs can be one whole pixel. 

The above discussion has referred only to misregistration caused by the 
sampling scheme and the scan-line-straightening procedure. Any misregistration 
related to the scanner electronics is in addition to that cited above. 
This additional misregistration in the conic data is only along the scan line. 
In the scan-line-straightened data its direction is still along the tangent to 
the conical scan at the point of interest. We can state the total expected 
misregistration in scan-line-straightened data as: 

    R_x = 1 + M \sin\theta   (pixels) 

    R_y = 1 + M \cos\theta   (pixels) 

where: 

    R_x     is the component of misregistration in the 
            straightened data along the scan line 

    R_y     is the component of misregistration in the 
            straightened data in the along track direction 

    M       is the maximum misregistration in the conic data 

    \theta  is the angle between the line tangent to the conic 
            scan at the point being considered and a line in 
            the along track or flight direction. 


This result will be used in the next section to show how misregistration 
affects the processing of data. 
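
As a small worked illustration of the two expressions above, the following 
sketch evaluates both components for a given angle. The function name is 
hypothetical, and the value M = 1.13 used in the usage note is the one the 
report later takes from Table 2.4. 

```python
import math

def straightened_misregistration(m_conic, theta_deg):
    """Expected misregistration (pixels) in scan-line-straightened data.

    m_conic   : maximum misregistration M in the conic data, in pixels
    theta_deg : angle between the tangent to the conic scan and the
                along track direction, in degrees
    Returns (R_x, R_y): along scan line and along track components.
    """
    theta = math.radians(theta_deg)
    r_x = 1.0 + m_conic * math.sin(theta)
    r_y = 1.0 + m_conic * math.cos(theta)
    return r_x, r_y
```

With M = 1.13, an angle of 0 degrees gives R_y = 2.13 and R_x = 1.0, while 
an angle of 90 degrees gives R_x = 2.13 and R_y close to 1.0, so 2.13 pixels 
is the largest value either component can take. 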

Another observation regarding misregistration in scan-line-straightened 
data is that it is not possible to correct the data, at least not with a simple 
algorithm such as was used for the conic data. Further, it is not even possible to 
correct within any one of the four subsets previously cited so as to reduce the 
misregistration due to scanner electronics within that one subset 
group of SDOs. That this is the case may be easily shown by using Figure 2.4 
below. 

FIGURE 2.4. SCAN-LINE STRAIGHTENING WITH MISREGISTERED DATA 


In the figure, pixels A, B, E, and F will be assigned sequentially to a 
straightened scan line. Assume that one SDO, SDO k, is one pixel out of 
registration with the other SDOs. Thus SDO k of pixel B images the area of 
pixel A, and SDO k of pixel E images the area of pixel D. Any attempt to 
simply shift, for the scan line straightened data, SDO k one pixel relative 
to all the other SDOs will result in SDO k of pixel B being the area of 
pixel D, and not pixel C as would be correct. It is possible that such a 
shifting technique would reduce the misregistration in some pixels, but it would 
increase the misregistration for other pixels and, more importantly, it would 
not be possible to know exactly which pixels were correct and which were not. 

It is not possible, in general, to predict where these discontinuities might 
occur as it is a function of spacecraft altitude, velocity, and heading. In 
general, it can be stated that these discontinuities will occur as frequently as 
every pixel at the ends of the scan lines, falling off as one moves toward 
the middle of the scan line to a frequency of about every 15th or 18th pixel at 
the point on the scan line directly ahead of the spacecraft. 

Finally, it is clear that the increased misregistration caused by the 
scan-line-straightening algorithm results in fewer pure field center pixels than 
for the conic data and in many more pseudo mixture pixels, i.e., pixels which 
have some SDOs imaging field center areas and other SDOs imaging field boundaries 
or even completely different fields. Even the mixture pixels will image different 
proportions of the classes in different SDOs. 

A thorough analysis of the effects of misregistration on the processing 
of S-192 data is presented in Chapter 4. 

The effects of misregistration due to the scan-line-straightening algorithm 
on S-192 data may be stated succinctly. 

1. There is greatly increased misregistration in scan-line-straightened 
   data over conic data. 

2. Scanner-caused misregistration between any pairs of channels 
   may not be easily corrected for in scan-line-straightened data. 

3. Scan-line-straightened data will have fewer pure field center 
   pixels than will conic data. 

2.1.4 S-192 RESOLUTION AND THE IDENTIFICATION OF FIELD CENTER PIXELS 

At the spacecraft altitude at the time the Southeast Michigan data set was 
collected, the S-192 scanner yielded a resolution cell 
almost 81 meters square, or about 0.65 hectares (1.6 acres) per resolution 
element. Especially for this test site, where the average agricultural field 
size is 15-18 acres, many of the resolution elements in the scene will be 
imaging two or more fields. Obviously, in extracting training statistics it is 
important that the data points used be only those data points which are purely 
of the class being considered. Thus a need arises to identify pure data 
points, or as they are more commonly called, field center points. The complement 
of the field center point is called a mixture data point. 

Thus far, the discussion has dealt with resolution elements and not pixels. 
A pixel is not, for the S-192 (and generally speaking), the same as a resolution 
element. A pixel, or picture element, refers to one data point, one vector of 
observations, sampled from the detector outputs. The S-192 system oversamples 
by approximately 10% along the scan direction, and the overlap between 
successive conical scans is also about 10% at the midpoint in the scanner's 
front field of view. 

The ground size of a pixel is given as 72 x 72 meters [1]. A brief 
analysis of actual pixel size was conducted in the following manner. Pairs of 
pixels in lakes were located on scan-line-straightened data graymaps. Care 
was taken to find pairs which were either on the same scan line, several hundred 
pixels apart, or located at the same scan point number, several hundred scan lines 
apart. Points corresponding to the pixels selected were also located on USGS 
maps of Southern Michigan. Distances were accurately measured on the USGS maps 
and on the graymaps; the result was that the pixels were measured to be 69 
meters wide in the along-scan direction and 72 meters in the along track 
direction. Calculations based on geometrical considerations using only the 
angle of the scan cone and the altitude at the time of data acquisition 
yielded measures of 70 x 70 meters. The differences are not felt to be serious. 

Having defined resolution elements, pixels, pure pixels and mixture pixels, 
the rest of this discussion is devoted to a procedure for identifying field 
center pixels. Preparatory to this, it should be understood that at ERIM 
individual fields are usually defined by the set of points 

    S = \{(x_i, y_i)\},  x_i = line number of vertex i,  y_i = point number at vertex i, 

which are the vertices of a generalized polygon which is the boundary of the field. 

Simply speaking, identification of field center pixels is accomplished 
by inscribing a smaller similar polygon within the polygon which defines 
the field being considered. A pixel is identified as a field center pixel if its 
center is within the inscribed polygon. The distance the field center polygon 
is inset from the original is calculated so that even in the worst case all 
the pixels in the field center polygon are guaranteed to be resolving only 
areas within the field. It is important to remember here the distinction between 
pixel size and the size of the resolution cell. 
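
A minimal sketch of such a test is given below. It does not construct the 
inscribed polygon explicitly; instead it accepts a pixel center when it lies 
inside the field polygon and at least the inset distance from every edge, 
which gives the same result for simple convex fields. This substitution, the 
function names, and the use of a single inset value for both directions are 
assumptions made for the example (different insets in the two directions 
could be handled by rescaling one coordinate first). 

```python
import math

def point_in_polygon(x, y, vertices):
    """Ray-casting test; vertices is a list of (line, point) pairs."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def distance_to_segment(x, y, x1, y1, x2, y2):
    """Shortest distance from (x, y) to the segment (x1, y1)-(x2, y2)."""
    dx, dy = x2 - x1, y2 - y1
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.hypot(x - x1, y - y1)
    t = max(0.0, min(1.0, ((x - x1) * dx + (y - y1) * dy) / length_sq))
    return math.hypot(x - (x1 + t * dx), y - (y1 + t * dy))

def is_field_center(x, y, vertices, inset):
    """True if the pixel center is inside the field and `inset` pixels from every edge."""
    if not point_in_polygon(x, y, vertices):
        return False
    n = len(vertices)
    return all(
        distance_to_segment(x, y, *vertices[i], *vertices[(i + 1) % n]) >= inset
        for i in range(n)
    )

def field_center_pixels(vertices, inset):
    """List the integer (line, point) pixel centers that qualify as field center."""
    xs = [v[0] for v in vertices]
    ys = [v[1] for v in vertices]
    return [
        (x, y)
        for x in range(math.floor(min(xs)), math.ceil(max(xs)) + 1)
        for y in range(math.floor(min(ys)), math.ceil(max(ys)) + 1)
        if is_field_center(x, y, vertices, inset)
    ]
```

For a square field with vertices at (0, 0), (0, 10), (10, 10) and (10, 0) 
and an inset of 2.5 pixels, only the pixels with both coordinates between 
3 and 7 are retained. 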

In general, the inset calculation is a summation of many components, and 
in fact the inset may be different in the direction of scan than in the along 
track direction. We can generalize the inset (I: {I_x, I_y}) as follows: 

    I_\alpha = \frac{D_\alpha}{P_\alpha} B + R_\alpha + L + S   (pixels) 

where 

    \alpha    indicates x: scan direction or y: line or along track 
              direction 

    D_\alpha  is the size of the resolution cell in the direction of \alpha 

    P_\alpha  is the size of the picture element in the direction of \alpha 

    B         is the inset necessary to insure that the pixel does not 
              include the boundary between fields. Typically B = 0.5 pixel. 

    R_\alpha  is the error due to misregistration effects, e.g., if one 
              channel is misregistered from the others by R pixels, then 
              this channel could still be imaging across the field boundary 
              when the other channels are imaged entirely within the field. 
              For conic data corrected for misregistration, 

                  R_x = 0.32 

              For straightened data, from the previous section we have 

                  R_x = 1 + M \sin\theta 
                  R_y = 1 + M \cos\theta 

              From Table 2.4, M is found to be 1.13. To develop one 
              measure for the whole scan line, we take the maximum values of 
              \sin\theta and \cos\theta, which is one. Thus: 

                  R_x = 2.13 
                  R_y = 2.13 

    L         is due to any field location errors which may have occurred. 

    S         is the error due to "movement" of individual pixels as a result 
              of the nearest neighbor scan line straightening. For conic 
              data, therefore, S = 0. For straightened data, S = 0.5 pixel. 

Thus, the inset to be used for conic data would be: 

    I_x = (D_x/P_x)(0.5) + 0.32 + L = 0.90 + L pixels 

    I_y = (D_y/P_y)(0.5) + 0.0 + L = 0.58 + L pixels 

while the inset to be used for scan-line-straightened data would be: 

    I_x = I_y = (D/P)(0.5) + 2.13 + L + 0.5 = 3.21 + L pixels 

The significant increase in inset for the second case here is due to 
the increased misregistration between SDOs found in scan-line-straightened 
data. Figure 2.5 illustrates the use of insets for these two cases. In the 
conic data case a 20 pixel field has eight certain field center pixels while 
a field of 75 scan-line-straightened data pixels has only six pure field 
center pixels. 
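
The arithmetic behind these insets can be checked with a short sketch. One 
reading of the inset expression that reproduces the quoted figures is 
I = (D/P) B + R + L + S, with the boundary term scaled by the ratio of 
resolution cell size to pixel size. That reading, the function name, and 
the choice D = 81 m and P = 70 m (the resolution cell size and the 
geometrically calculated pixel size given earlier) are assumptions; they 
are used here because they give (D/P)(0.5) = 0.58, matching the text. 

```python
def inset_pixels(cell_size_m, pixel_size_m, boundary, misregistration,
                 location_error=0.0, straightening=0.0):
    """Inset in pixels under the assumed form I = (D/P)*B + R + L + S.

    cell_size_m     : D, resolution cell size in meters
    pixel_size_m    : P, pixel size in meters
    boundary        : B, boundary term (0.5 in the text)
    misregistration : R, misregistration term in pixels
    location_error  : L, field location error in pixels
    straightening   : S, pixel movement from scan line straightening
    """
    return (cell_size_m / pixel_size_m) * boundary + misregistration \
        + location_error + straightening

# Values from the text, with L left at zero.
conic_x = inset_pixels(81.0, 70.0, 0.5, 0.32)
conic_y = inset_pixels(81.0, 70.0, 0.5, 0.0)
straightened = inset_pixels(81.0, 70.0, 0.5, 2.13, straightening=0.5)
```

Rounded to two decimals these evaluate to 0.90, 0.58 and 3.21 pixels, to 
which the field location error L is then added. 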

In summary, what has been presented here is the inset to be used to insure 
that any pixel identified as a pure, field center pixel is resolving only 
one ground class in all of its bands. This insures that the training 
statistics will refer only to pure conditions of the classes they represent. 

2.1.5 PROCESSING CONIC DATA 

The bulk of the processing carried out for this study was done on conic, 
not scan-line-straightened, data. Using conic data meant that the 
misregistration in the data was not compounded by the scan-line-straightening algorithm. 
More importantly, it meant that remedial algorithms, as described at the end 
of Section 2.1.3.1, could be and were employed to significantly reduce the 
misregistration in the data. 

The drawback to using conic data is that graymaps of individual SDOs are 
somewhat distorted. For most of our work, however, such graymaps proved 
adequate. For instances where undistorted maps were desired, a special 
implementation of the digital mapping program was used. In this mode, the data 
to be presented are broken into groups of 40-55 pixels for which the conical 
arc over those points can be approximated by a straight line. Then each swath 
of data is mapped, the symbols being printed diagonally on the printer, 
incrementing one print line every n characters. Additionally, conic pixels 
falling at points where there is overlap in the undistorted printed map can be 
deleted. While this does not produce as rectified a map as the 
scan-line-straightening algorithm employed at JSC, graymaps generated in this manner are 
only slightly distorted. On the whole, we found the mechanics of data 
manipulation when using conic data to be little different than when working 
with straightened data. 

2.2 M-7 MULTISPECTRAL SCANNER 

The ERIM M-7 multispectral scanner is an aircraft mounted line scanner 
with the capability of recording data in 12 wavebands from the ultraviolet to 
the thermal infrared region of the spectrum [6]. Appendix II lists the 
spectral and optical characteristics of the M-7 scanner. For this study, the 
M-7 was used to acquire data over the test site at the same time as the S-192 
data were collected [7]. The M-7 data were acquired at an altitude 
of 2000 ft,* along several parallel North-South passes over the 
test site. The data used for this study were from Run 2 of flight 1. Run 2 began 
at 1056 hrs. E.D.T. and ended at 1105 hrs. E.D.T., while the S-192 data set was 
acquired at 1102 hrs. E.D.T. 

2.2.1 M-7 SIGNAL CHARACTERISTICS 

The M-7 data were digitized and preprocessed as described in Appendix V 
and the data were then analyzed for signal characteristics. The analysis was 
carried out along the same lines as that for the S-192. 

The dynamic range analysis is presented in Table 2.5 and the signal:noise analysis 
in Table 2.6. For this latter table, the only water body in the M-7 data set was 
a small farm pond. For the sake of rigorous comparison to the procedures of 
Section 2.1 the statistics derived from it are presented. However, we should 
point out that since it is a very small water body and probably much shallower 
than Lake Lansing, the estimated noise for the M-7 is probably larger than the 
actual noise characteristic of the data. For the M-7 data, however, we can 
obtain a very accurate estimate of the scanner-related noise by analyzing the 
signals derived from the "dark level", i.e., that portion of the data generated 
while scanning the dark interior of the scanner housing. Since here the 
illumination is zero, any variation in the signal is due just to scanner system 
noise. The calculated noise from the dark level and from the water signature 
are presented in Table 2.7. To calculate the standard deviations reported in 
Table 2.7, the dark levels of 1000 consecutive scan lines were analyzed. For 
the thermal band, the noise on the cold reference plate was used instead. It 
can be seen that in some bands the noise measured from the water signature 
is much greater than the dark level noise. So it seems that using the water 
signature as the noise measurement results in an overestimation of the 
scanner noise. 

*The original flight plan had been for data collection from an altitude of 
5000 ft.; however, haze over the site necessitated collection at a lower altitude. 


TABLE 2.5. M-7 SIGNAL CHARACTERISTICS: DYNAMIC RANGE 

Total Available Range = 512 Counts 

CHANNEL     DYNAMIC RANGE^1     DYNAMIC RANGE^2 

   1         71-284, 213          90-137,  47 
   2         65-301, 235          89-155,  66 
   3         64-388, 304         105-186,  81 
   4         76-369, 193         110-185,  75 
   5         59-317, 278         100-165,  65 
   6         57-318, 261          91-159,  68 
   7         63-343, 280          94-192,  98 
   8         56-362, 306          73-196, 123 
   9         10-208, 198          94-158,  64 
  10         10-207, 197         109-166,  57 
  11         12-270, 258         109-176,  69 
  12         60-287, 227         100-210, 110 

^1 Used Continuous Rule 
^2 Used 10% Rule 


TABLE 2.6. M-7 SIGNAL CHARACTERISTICS: SIGNAL:NOISE 

CHANNEL   WATER MEAN   STANDARD DEVIATION   SIGNAL:NOISE   RANGE:NOISE 

   1        102              5.0                20.4            9.4 
   2        103.5            5.6                18.5           11.8 
   3        120              6.2                19.4           13.1 
   4        111              5.6                19.8           13.4 
   5         88              5.6                15.7           11.6 
   6         86              5.1                16.9           13.3 
   7         88              6.7                13.1           14.6 
   8         68              6.7                10.1           18.4 
   9         21              2.9                 7.2           22.1 
  10         19              2.8                 6.8           20.4 
  11         24              6.3                 3.8           11.0 
  12         84              4.3                19.5           25.6 


TABLE 2.7. M-7 NOISE CHARACTERISTICS FROM DARK LEVEL SIGNALS 

CHANNEL   MEAN   STANDARD DEVIATION   DARK LEVEL SIGNAL:NOISE   WATER SIGNAL:NOISE 

   1       26          1.3                   20.0                     20.4 
   2       22          1.2                   18.3                     18.5 
   3       21           .53                  39.6                     19.4 
   4       28          1.3                   21.5                     19.8 
   5       30          1.2                   25.0                     15.7 
   6       27          1.2                   22.5                     16.9 
   7       58          1.5                   38.7                     13.1 
   8       26           .9                   28.9                     10.1 
   9       52          1.5                   34.7                      7.2 
  10       39          1.0                   39.0                      6.8 
  11       30          1.5                   20.0                      3.8 
  12*      31          4.1                    7.6                     19.5 

*Results for this band were analyzed from cold reference 
plate signals. 


TABLE 2.8. DISTANCE BY CHANNEL BETWEEN A CORN AND 
A WOODLOT DISTRIBUTION FOR M-7 DATA 

CHANNEL   DISTANCE 

   1        6.02 
   2        7.1 
   3        2.6 
   4        9.1 
   5        7.3 
   6        8.0 
   7        8.7 
   8        9.5 
   9       10.4 
  10        5.2 
  11        3.7 
  12        6.0 

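
The tabulated ratios appear to be simple quotients: the signal:noise 
entries match the mean divided by the standard deviation, and the 
range:noise entries match the 10% rule dynamic range divided by the water 
standard deviation. The sketch below assumes exactly that; the function 
names are hypothetical, and the use of the population standard deviation 
for the dark level samples is an assumption. 

```python
import statistics

def noise_from_dark_level(samples):
    """Mean and standard deviation of dark level samples.

    With zero illumination, the deviation is taken as the system noise.
    """
    return statistics.mean(samples), statistics.pstdev(samples)

def signal_to_noise(mean_signal, noise_std):
    """Signal:noise as mean signal over noise standard deviation."""
    return mean_signal / noise_std

def range_to_noise(dynamic_range, noise_std):
    """Range:noise as dynamic range (in counts) over noise standard deviation."""
    return dynamic_range / noise_std

# Channel 1 values taken from Tables 2.5 through 2.7.
water_sn = signal_to_noise(102, 5.0)    # water mean / water standard deviation
water_rn = range_to_noise(47, 5.0)      # 10% rule range / water standard deviation
dark_sn = signal_to_noise(26, 1.3)      # dark level mean / dark level deviation
```

For channel 1 these reproduce the tabulated 20.4, 9.4 and 20.0 after 
rounding to one decimal place. 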

The same corn field and woodlot areas which were used for the 
S-192 part of this analysis were located in the M-7 data set. Signatures 
were calculated and the same distance measure used previously was also 
calculated here. Table 2.8 presents the results of this analysis. On the 
whole the two distributions appear to be very well separated in these data. 
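
To give a feel for what distances of this size can mean, the sketch below 
converts a separation into a misclassification probability. The report's 
own distance measure is defined in an earlier section and is not reproduced 
here, so the model used is an assumption: two one-dimensional Gaussian 
distributions with equal variance and equal prior probability, whose means 
lie a given number of standard deviations apart, with the decision 
threshold midway between them. The function name is hypothetical. 

```python
import math

def misclassification_probability(distance_sigma):
    """Error rate for two equal-variance, equal-prior Gaussians.

    distance_sigma : separation of the two means in standard deviations
    The threshold is midway between the means, so the error is the
    normal tail probability beyond distance_sigma / 2.
    """
    return 0.5 * math.erfc(distance_sigma / (2.0 * math.sqrt(2.0)))
```

Under this model a separation of zero gives an error of 0.5 and a 
separation of two standard deviations gives about 0.16, and the error 
continues to fall steeply as the separation grows. 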

Comparisons of the two scanners for dynamic range, noise and signature 
separability will be made in the following section. 

2.3 COMPARISON OF S-192 AND M-7 SIGNAL CHARACTERISTICS 

The purpose in comparing data characteristics of the M-7 and S-192 
multispectral scanners is not to prove one better than the other, but rather to 
better understand the capabilities of the new S-192 scanner. The M-7 has been 
widely used for several years and its capabilities and performance are well 
known while, on the other hand, the S-192 is only the second experimental 
spaceborne multispectral scanner and its performance and capabilities are 
unknown. By comparing these two scanners, we hope to better understand the 
S-192 and perhaps be able to suggest improvements or refinements for the 
next generation of spacecraft scanners. 

Briefly, with reference to Tables 2.1 through 2.8, it is seen that the 
dynamic range of the two data sets is very different; in particular, the 
S-192 data range in the better channels is no more than 5 bits. Also, looking 
at the dynamic range in relation to the level of noise (as expressed in 
statistics over bodies of water), the S-192 data range:noise is a quarter to a third 
that of the M-7 data. 

As for the separability of ground classes of interest, this was investigated 
by determining the separability of two specific fields, one corn and one woodlot, 
which were scanned in both of the data sets. 

A comparison of Tables 2.3 and 2.8 shows that, on the whole, the separability 
for the M-7 data is much greater than that for the S-192. One should remember in 
making the comparison that, because the measurement is of two Gaussian 
distributions and is given in terms of σ, the actual probability of 
misclassification declines exponentially at a rapid rate as the distance slowly increases. 

A further, finer comparison can be based on the fact that three of the 
S-192 and M-7 spectral bands are very similar. This identification was made 
upon inspection of spectral response curves for the two scanners. These 
bands are listed in Table 2.9. 

Of the three bands treated here, two are in the near-infrared and one in 
the visible (green) portion of the spectrum. For these three bands we can 
compare, in Table 2.9, the key quantities calculated for each scanner. 

TABLE 2.9. BAND TO BAND COMPARISON OF M-7 AND S-192 RESULTS 

SCANNER AND BAND   WAVELENGTH (µm)   DYNAMIC   RANGE:   SIGNAL:   CORN-WOODLOT 
                                     RANGE     NOISE    NOISE     SEPARABILITY 

S-192 Band 4        0.54 - 0.59        13        5.4     10.5        0.06 
M-7   Band 6        0.55 - 0.60        68       13.3     16.9        8.0 

S-192 Band 10       1.15 - 1.28        28        4.3      3.4        0.20 
M-7   Band 10       1.00 - 1.50        57       20.4      6.8        5.2 

S-192 Band 11       1.55 - 1.73        19        5.8      4.4        3.0 
M-7   Band 11       1.50 - 1.80        69       11.0      3.8        3.7 


From the above comparisons, it is clear that the S-192 data have a very 
limited range of data values, especially in relation to system noise. This 
small range:noise in turn severely inhibits the separability of classes of 
interest. Such problems with S-192 data appear to be due, in some part, to the 
effects of the atmosphere on radiation sensed by the scanner. In general, the 
atmosphere reduces data contrast and, in this instance, with a variable haze 
covering the test site area, the effect was more pronounced. Another factor 
which seems to make a difference between the two data sets is that, on the one 
hand, full use of the available signal range on the M-7 was achieved by manual 
intervention both during data acquisition and during the digitizing process, 
while on the other the S-192 system was uniformly set to handle surface 
radiance over a wide range of atmospheric and ground conditions, resulting in 
a very limited available range for any particular instance. Perhaps a more 
versatile acquisition system design would have upgraded the S-192 performance. 

3 

PROCESSING RESULTS FOR THE AGRICULTURAL TEST SITE 

SKYLAB S-192 data over Southern Michigan were processed for two sites, 
showing two different applications of multispectral data. The first 
application was in the performance of an agricultural survey over the 
Southeast Michigan EREP test site. Both SKYLAB S-192 and aircraft M-7 
scanner data were collected over the area and are discussed in Sections 3.1 
and 3.2, respectively. The second, a land use evaluation for the urban-rural 
area around Lansing, Michigan, is detailed in Section 5. 

3.1 PROCESSING RESULTS FOR S-192 AGRICULTURAL DATA 

3.1.1 SIGNATURE EXTRACTION 

The agricultural test site, detailed in Appendix III, comprised 90 
sections (each about 1 mile square) in Ingham County, Michigan. The process 
of field location and identification was accomplished using a semi-automated 
technique described in Appendix VI. Briefly, field vertices were digitized 
from large-scale photography and transformed to data scan line and scan point 
coordinates. The same procedures used to identify pure field center pixels 
within each of the ground truth fields, as described in Section 2.1.4, were 
applied in order to identify the field center pixels for each of the 90 
ground truth sections. Table 3.1 shows the classes in the scene along with 
the number of field center pixels identified for those classes; the notation 
of local (north 40 sections) and non-local (south 50 sections) sections of 
the ground truth area refers to the manner in which the area was divided 
for training and testing purposes. 

Some of the class names require explanation. Unplanted farm areas 
were listed in the ground truth as any of these categories: sod, grass, 
clover, grassy weeds, weeds, pasture, fallow and stubble. These were 
deemed to be similar in character, differing only in the proportion of 
ground covered by the vegetation, and hence were lumped together in the 
category of forage. The category of trees was deemed to be dense stands 
of mature trees, while the term brush was used in the ground truth to 
indicate scrub forest, some less dense tree stands, and brushy areas.* 

TABLE 3.1 NUMBER AND DISTRIBUTION OF FIELD CENTER PIXELS 

                  NUMBER OF FIELD CENTER PIXELS FOR CLASS 

                  LOCAL AREA               NON-LOCAL 
CLASS             (NORTH 40 SECTIONS)      (SOUTH 50 SECTIONS) 

CORN                     344                      549 
TREES                     24                      260 
BRUSH                     68                       39 
SOYBEANS                  19                       52 
ALFALFA                   23                       20 
GRASS                    398                      264 
STUBBLE                   53                       71 
BARE SOIL                 38                       43 
URBAN                     69                        0 
FIELD BEANS                0                       56 

TOTAL                   1036                     1307 

*The signature extraction, classification and assessment of results for 
the S-192 data reported in this section were carried out under Contract 
NAS9-13332, a subcontract involving the performance of S-192 data processing 
for Michigan State University [5]. 

Training was carried out using only pixels from the north 40 sections. 
The use of the north area for training rather than the south area was an 
arbitrary choice. The use of 40 sections, rather than some subset of the 40, 
was based on the desire to obtain training statistics for as many of the 
classes in the scene as possible and also to have a large number of pixels 
for each class so as to estimate the training statistics for each class 
more accurately. In extracting the training statistics, rather than lumping 
all pixels of a common class together and thereby calculating one set of 
statistics for a class, we implemented a supervised clustering approach. 

In this manner, all the field center pixels of each class were clustered 
independently. Thus if a ground class, which is basically an economic 
distinction, varies physiologically so that several spectral signatures are 
necessary to fully represent the class, this method will yield a better set 
of training signatures. Table 3.2 lists the clusters obtained from this 
procedure. 

TABLE 3.2 DERIVED TRAINING CLUSTERS FOR S-192 AGRICULTURAL DATA SET 

CLUSTER          NUMBER OF PIXELS 

CORN 1                 134 
CORN 2                  28 
CORN 4                 129 
CORN 5                  28 
ALFALFA                 20 
TREES 3                 12 
TREES 4                 10 
BRUSH                   55 
SOYBEANS                18 
BARE SOIL               20 
CLOVER                  10 
STUBBLE                 32 
WEEDS                   43 
GRASS 1                 27 
GRASS 2                 17 
GRASS 3                 79 
GRASS 6                 22 
PASTURE 7               50 
PASTURE 8               49 
PASTURE 9               20 
PASTURE 10              20 
URBAN 1                 29 
URBAN 2                 14 
URBAN 3                 12 

The signatures extracted in this manner were 12-channel signatures. 
SDO 18 (.45-.50 μm) was dropped from the processing at this point because, 
besides containing no information (see Section 2), its wildly fluctuating 
data values were causing confusion in the analysis. 

3.1.2 SIGNATURE ANALYSIS AND SELECTION OF OPTIMUM BANDS 

The set of signatures was first analyzed to see if any of the apparent 
spectral subclasses were due solely to effects of the noisier channels. 
At the same time, they were examined to determine if some of the signatures 
for the same class might be combined. Since the cost of classifying a 
data set is directly related to the size of the signature set, it is 
important to reduce the size of the signature set whenever possible. 

As a first step, all 24 signatures were input to program STEPL, which 
calculates optimum channels. Care was taken so that intra-class differences 
were ignored; the channels were selected on an inter-class basis only. The 
results, as shown in Table 3.3 below, indicate that three of the four bands 
(SDOs 6, 21, 22) identified initially as noisy or having poor signal 
quality were also identified by the program as the least useful in 
discriminating among the object classes. The fourth band previously identified 
as too noisy to use, SDO 8 (.67-.73 μm), which covers the region of 
chlorophyll absorption and is thus a key band in the discrimination of 
vegetation classes, was deemed by this program to be of use. This is perhaps 
so because the separation of classes in this band is still greater than the 
noise content of the band. 

Next, with the aid of one- and two-channel signature plots and outputs 
from computer programs which measure pairwise probabilities of misclassification 
and also estimate theoretical performance matrices for sets of signatures, the 
signature set was reduced from 24 to 15 signatures. Six of the signatures 
were simply dropped: the three urban signatures because they were deemed to 
be primarily mixtures of grass, soil and trees and also because there were 
no other urban features in the test site to test them on; the clover 
signature because it was very similar to some of the grass signatures and 
there was no other clover in the test site; and the stubble and Pasture 7 
clusters because they were found to be redundant with some of the grass 
clusters. Signatures combined on the basis of spectral similarity were: 
Pasture 8 with Pasture 10; Grass 6 with Pasture 9; and Grass 3 with Weeds. 

TABLE 3.3 SDO RANKING BASED ON OPTIMUM BAND CRITERIA 

                                  CUMULATIVE PAIRWISE 
                                  PROBABILITY OF 
RANK     SDO #     λ (μm)         MISCLASSIFICATION 

  1       19       .93-1.05            .21 
  2       12       1.5-1.7             .10 
  3        2       .50-.55             .07 
  4        8       .66-.73             .056 
  5       10       .77-.89             .048 
  6       17       1.15-1.28           .039 
  7       20       1.03-1.12           .034 
  8        6       .60-.65             .031 
  9        4       .54-.60             .028 
 10       14       2.1-2.34            .026 
 11       21       10.2-12.5           .024 
 12       22       .40-.45             .022 

This reduced set of signatures was input to Program STEPL for a final 
calculation of optimum bands. As reported in Table 3.4, the rank ordering 
is almost the same, although the pairwise probability of misclassification 
has increased slightly due to the several combination signatures in the set. 
Seven bands were selected for processing using a rule of thumb which says to 
select n channels where the decrease in the probability of misclassification 
is less than .005 between n channels and n+1 channels. 
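
The rule of thumb can be sketched in a few lines of Python. This is only an illustration: the function name and the list layout are assumptions made here and are not part of Program STEPL; the probabilities are the values printed in Table 3.4. 

```python
def select_channel_count(cumulative_error, threshold=0.005):
    """Return the smallest n for which going from n to n+1 channels lowers
    the probability of misclassification by less than `threshold`.

    cumulative_error[k] is the probability obtained with the best k+1
    channels. Hypothetical helper, not part of Program STEPL.
    """
    for n in range(1, len(cumulative_error)):
        gain = cumulative_error[n - 1] - cumulative_error[n]
        if gain < threshold:
            return n
    # No step fell below the threshold: keep every channel supplied.
    return len(cumulative_error)


# Probabilities of misclassification for ranks 1 to 8, as printed in Table 3.4.
table_3_4 = [0.21, 0.11, 0.088, 0.070, 0.058, 0.050, 0.043, 0.039]
print(select_channel_count(table_3_4))  # 7
```

With the Table 3.4 values, the first step that gains less than .005 is the one from seven to eight channels (.043 to .039), which agrees with the seven bands selected. 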

3.1.3 CLASSIFICATION RESULTS OF S-192 AGRICULTURAL DATA SET 

The signature set described in the previous sections was applied to 
classify all 90 sections in the agricultural data set. Two bases for 
evaluation were used to analyze the results. The first basis was the 
pixel-by-pixel classification results for pixels of known class, i.e., the 
previously identified field center pixels. The second was an analysis of 
proportion estimation as taken from aggregated classification counts, which 
provides a broader evaluation of the classification results. 

TABLE 3.4 FINAL SELECTION OF OPTIMUM BANDS 

                                  PROBABILITY OF 
RANK     SDO       λ (μm)         MISCLASSIFICATION 

  1       12       1.50-1.70           .21 
  2       19       .93-1.05            .11 
  3        2       .50-.55             .088 
  4       10       .77-.89             .070 
  5       17       1.15-1.28           .058 
  6        8       .66-.73             .050 
  7       20       1.03-1.19           .043 
  8        4       .54-.60             .039 

Tables 3.5 and 3.6 present performance matrices for just the field 
center pixels in the north and south areas, respectively. The bottom lines 
of the tables show the total proportion of field center pixels classified 
to each recognition class, and present the ground truth proportion for 
comparison. Finally, estimates of the overall classification rates are 
offered. 

Examination of the performance matrices shows that overall performance 
is only fair. One major problem is the high percentage of corn pixels 
being classified as trees/brush, as well as a large number of other pixels 
being classified as corn. The trees/brush classification is especially 
disappointing. Investigation of this showed the classes to be, simply, very 
similar spectrally. Some of the other apparently false recognitions are not 
entirely spurious. Several of the stubble pixels could indeed be bare or 
almost bare soil, for example, or some of the brush pixels might be weedy or 
pasture spots in low density brush areas. In comparing the south area to 
the north, it is seen that the forage subclasses in the south area 
distinctly fall off in terms of classification accuracy, with most of the 
incorrectly classified pixels being called corn and soybeans. Soybean 
recognition also suffers. However, it was noticed in the aerial 
photography that soybeans were a highly variable ground cover for the data set 
at this time of year. There are probably too few soybean pixels in the 
sample to give an accurate accounting of the classification performance. 

TABLE 3.5 PERFORMANCE MATRIX FOR CLASSIFICATION OF FIELD CENTER PIXELS 
          FROM NORTH 40 SECTIONS 

                             PERCENT OF FIELD CENTER PIXELS ASSIGNED TO RECOGNITION CLASS: 

GROUND TRUTH        NO. 
CLASS               PIXELS     CORN     FORAGE   TREE/BRUSH   BARE SOIL   SOYBEAN   UNCLASSIFIED 

CORN                  344      73.0       6.4       18.1         0.3        1.7         0.6 
FORAGE                474       8.9      81.4        3.8         3.6        1.7         0.6 
  (GRASS 398)                  (7.3)    (83.7)      (4.6)       (2.5)      (1.3)       (0.8) 
  (ALFALFA 23)                (21.7)    (69.5)      (0.0)       (0.0)      (8.7)       (0.0) 
  (STUBBLE 53)                (15.1)    (69.8)      (0.0)      (13.2)      (1.9)       (0.0) 
TREE/BRUSH             92      26.1      17.4       51.1         0.0        0.0         5.4 
  (TREES 24)                   (4.2)    (20.8)     (75.0)       (0.0)      (0.0)       (0.0) 
  (BRUSH 68)                  (33.8)    (16.2)     (42.6)       (0.0)      (0.0)       (7.4) 
BARE SOIL              38      13.2       7.9        0.0        79.0        0.0         0.0 
SOYBEAN                19      31.6      10.6        0.0         0.0       57.9         0.0 

TOTAL                 967      33.9      44.4       13.1         5.0        2.6         1.0 
GROUND TRUTH (%)               35.6      49.0        9.5         3.9        2.0         0.0 

RMS Error in Proportion Estimation (%) = 2.57 (Excluding Urban) 

Overall Percent Correct Classification of Pixels = 75.0% (Excluding Urban) 


TABLE 3.6 PERFORMANCE MATRIX FOR CLASSIFICATION OF FIELD CENTER PIXELS 
          FROM SOUTH 50 SECTIONS USING SIGNATURES OBTAINED FROM 
          40 NORTHERN SECTIONS 

                             PERCENT OF FIELD CENTER PIXELS ASSIGNED TO RECOGNITION CLASS: 

GROUND TRUTH        NO. 
CLASS               PIXELS     CORN     FORAGE   TREE/BRUSH   BARE SOIL   SOYBEAN   UNCLASSIFIED 

CORN                [row not legible in source] 
FORAGE                355      23.9      68.7        2.0         3.9       29.0         0.0 
  (GRASS 264)                 (21.6)    (74.3)      (2.7)       (0.0)      (1.5)       (0.0) 
  (ALFALFA 20)                (80.0)    (20.0)      (0.0)       (0.0)      (0.0)       (0.0) 
  (STUBBLE 71)                (16.9)    (62.0)      (0.0)      (19.7)      (1.4)       (0.0) 
TREE/BRUSH            308      31.5      12.3       51.9         0.0        2.6         1.6 
  (TREES 269)                 (32.7)     (8.6)     (55.8)       (0.0)      (1.1)       (1.9) 
  (BRUSH 39)                  (23.1)    (38.5)     (25.6)       (0.0)     (12.8)       (0.0) 
BARE SOIL              43       4.7      30.2        2.3        62.8        0.0         0.0 
SOYBEAN                52      15.4      65.4        0.0         0.0       19.2         0.0 
FIELD BEAN             56      67.9      28.6        3.6         0.0        0.0         0.0 

TOTAL                1363      47.5      28.6       18.1         3.0        2.4         0.4 
GROUND TRUTH (%)               40.3      26.0       22.6         3.2        3.8         4.1 

RMS Error in Proportion Estimation (%) = 3.97 

Overall Percent Correct Classification of Pixels = 63.0% 
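
The two summary figures printed under Tables 3.5 and 3.6 can be checked with a short calculation. The sketch below assumes that the overall percent correct is the pixel-weighted average of the diagonal of the performance matrix, and that the RMS error in proportion estimation is taken over the six recognition columns (five classes plus unclassified) of the TOTAL and GROUND TRUTH rows. Both readings are assumptions made here; they do reproduce the printed values. The function names are illustrative only. 

```python
import math


def overall_percent_correct(n_pixels, percent_correct):
    """Pixel-weighted average of the diagonal of a performance matrix."""
    total = sum(n_pixels)
    correct = sum(n * p / 100.0 for n, p in zip(n_pixels, percent_correct))
    return 100.0 * correct / total


def rms_proportion_error(estimated, truth):
    """Root mean square difference between estimated and true proportions."""
    squared = [(e - t) ** 2 for e, t in zip(estimated, truth)]
    return math.sqrt(sum(squared) / len(squared))


# Table 3.5 (north 40 sections): corn, forage, tree/brush, bare soil, soybean.
north_pixels = [344, 474, 92, 38, 19]
north_diagonal = [73.0, 81.4, 51.1, 79.0, 57.9]
# TOTAL and GROUND TRUTH rows of Table 3.5 (last column: unclassified).
north_estimate = [33.9, 44.4, 13.1, 5.0, 2.6, 1.0]
north_truth = [35.6, 49.0, 9.5, 3.9, 2.0, 0.0]
# TOTAL and GROUND TRUTH rows of Table 3.6 (south 50 sections).
south_estimate = [47.5, 28.6, 18.1, 3.0, 2.4, 0.4]
south_truth = [40.3, 26.0, 22.6, 3.2, 3.8, 4.1]

print(round(overall_percent_correct(north_pixels, north_diagonal), 1))  # 75.0
print(round(rms_proportion_error(north_estimate, north_truth), 2))      # 2.57
print(round(rms_proportion_error(south_estimate, south_truth), 2))      # 3.97
```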

The analysis of the proportion estimation results for both north and 
south areas is presented in Table 3.7. These results are the classification 
counts over all pixels in each area. Given in the table for each 
area and ground class are the ground truth proportion, the proportion of 
pixels in the area classified as that class, and the RMS error of the 
estimate. 

The striking features of this table are that corn is overestimated, 
and that the error rate is larger in the nonlocal (south) area. 
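
As an illustration of the per-class RMS error reported in Table 3.7, the sketch below assumes that the error for class j is the root mean square, over the n agricultural sections, of the difference between the estimated and true proportion of class j in each section (the reading suggested by the symbol definitions under the table). The section-level proportions are not given in this report, so the numbers used here are hypothetical and serve only to show the calculation. 

```python
import math


def rms_error_by_class(true_props, est_props):
    """Per-class RMS error over sections.

    true_props[i][j] and est_props[i][j] are the true and estimated
    proportions of class j in section i.
    """
    n_sections = len(true_props)
    n_classes = len(true_props[0])
    errors = []
    for j in range(n_classes):
        squared = sum(
            (est_props[i][j] - true_props[i][j]) ** 2 for i in range(n_sections)
        )
        errors.append(math.sqrt(squared / n_sections))
    return errors


# Hypothetical proportions (percent) for three sections and two classes.
true_props = [[30.0, 70.0], [50.0, 50.0], [40.0, 60.0]]
est_props = [[42.0, 58.0], [47.0, 53.0], [49.0, 51.0]]
print([round(e, 2) for e in rms_error_by_class(true_props, est_props)])
```

Because the error is accumulated section by section, a class can show a large per-class RMS error even when its area-wide proportion is estimated fairly well, since over- and under-estimates in individual sections do not cancel. 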

Conclusions regarding these classification results will be given in 
Section 3.3, where comparison can be made with results obtained from pro- 
cessing aircraft scanner data from the same site. 

3.2 RESULTS OF PROCESSING M-7 AGRICULTURAL DATA SET 

For purposes of comparison with the S-192 data processing results, 
training and classification were carried out for the M-7 data over 
a small 1.5 square mile area. The area selected for training and testing 
was located at mile three from the beginning of the flight line, line 2. 
(A complete description of the data, digitization and preprocessing is 
given in Appendix V.) This region was chosen because it was the first 
area in the data set which contained several large contiguous areas of 
corn, soybeans, woodlots and bare soil. All fields within the area were 
identified and used in this exercise. The classes, number of fields and 
number of pixels for each class appear as part of the results in Table 3.9. 

TABLE 3.7 GROUND TRUTH PROPORTIONS AND RECOGNITION 
          ESTIMATES FOR LOCAL (NORTH 40) AND NONLOCAL 
          (SOUTH 50) RECOGNITION OVER LARGE AREAS 

                           North 40 Area                          South 50 Area 

Ground Cover     Ground   Recognition   RMS Error       Ground   Recognition   RMS Error 
Class            Truth    Counts        by Class*       Truth    Counts        by Class* 

Corn             26.5%      36.8%         13.8          33.3%      48.0%         17.0 
Trees/Brush      17.2       14.3           7.3          16.5       13.3           7.2 
Forage           47.4       40.5           9.7          35.5       30.9          11.0 
Bare Soil         7.2        5.4           4.4           7.2        3.3           7.4 
Soybeans          3.7        2.4           5.0           4.0        4.4           5.6 
Other             3.1        0.4           5.9           4.7        0.0           7.8 

RMS Error+                   4.66%                                  6.89% 

*RMS error by class was calculated as: 

$$ \mathrm{RMS}_j = \left[ \frac{1}{n} \sum_{i=1}^{n} \left( \hat{p}_{ij} - p_{ij} \right)^2 \right]^{1/2} $$ 

for: j = Class j 
     n = Number of agricultural sections used 
     p_ij = True proportion of class j in Section i 
     p̂_ij = Estimated proportion of class j in Section i 

+RMS error was calculated as: 

$$ \mathrm{RMS} = \left[ \frac{1}{m} \sum_{j=1}^{m} \left( \hat{p}_{j} - p_{j} \right)^2 \right]^{1/2} $$ 

for m classes, with p_j and p̂_j the true and estimated proportions of 
class j for the area. 

3.2.1 TRAINING PROCEDURES 

Initial training for the M-7 data was accomplished by using an unsupervised 
clustering technique to process every ninth pixel in the area. This 
technique yielded 59 clusters. The output graymap of cluster assignments 
was examined and an association was established between clusters and actual 
ground covers. It was shown that four major object classes (corn, soybeans, 
trees, and hay) were represented by very few clusters, while the various 
other ground covers such as weeds, bare soil, cut hay, senescent vegetation, 
pastures, farmsteads, etc., which display a wide degree of variability, 
were represented by 85% of the clusters. By examining the statistics for 
the cluster groups we were able to generalize the larger of these clusters 
into eight broad classes, as noted in Table 3.8. 


TABLE 3.8 COMBINING CLUSTERS BASED ON REPRESENTING 
          COMMON OBJECT CLASSES 

                                                  Total No. 
Class                        No. of Clusters      of Points 

1. Corn                            2                2006 
2. Soybeans                        3                 217 
3. Trees                           3                 566 
4. Hay                             1                1771 
5. Sparse Vegetation               8                 252 
6. Grass                           4                 889 
7. Bare Soil                       9                 305 
8. Dark or Wet Bare Soil           6                 301 

TOTAL                                               6307 

Next the statistics (means and standard deviations) for the clusters 
in each group were combined to yield one signature for use in classification 
processing. In order to reduce classification costs, it was necessary to 
combine the clusters so as to greatly reduce the number of training signatures 
used in classification processing. Also, it was felt that for this data no 
loss of accuracy would result since it appeared from our analyses that there 
was very little overlap between class groups of clusters. As an additional 
safeguard, the program which calculates the new signature first performs a 
χ² test on each signature to measure its distance (in a probability sense) 
to the mean of the other signatures in the group and rejects signatures if 
the distance is too large. 
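
The combining step with its safeguard might be sketched as follows. This is not the ERIM program, whose details are not given here. The sketch assumes uncorrelated channels (only means and standard deviations are used), a pixel-weighted mean of the other clusters as the reference point, and a pooled variance that includes the spread of the cluster means. The function name, the sample clusters and the rejection limit are all hypothetical. 

```python
import math


def combine_signatures(clusters, chi2_limit):
    """Combine cluster statistics into one signature, rejecting outliers.

    clusters: list of (n_pixels, means, stds), one tuple per cluster.
    Returns (kept_indices, combined_means, combined_stds).
    """
    n_ch = len(clusters[0][1])
    kept = []
    for k, (_, mean_k, std_k) in enumerate(clusters):
        others = [c for i, c in enumerate(clusters) if i != k]
        if not others:  # a single cluster has nothing to be compared with
            kept.append(k)
            continue
        n_other = sum(c[0] for c in others)
        other_mean = [
            sum(c[0] * c[1][ch] for c in others) / n_other for ch in range(n_ch)
        ]
        # Chi-square style distance, assuming uncorrelated channels.
        chi2 = sum(
            ((mean_k[ch] - other_mean[ch]) / std_k[ch]) ** 2 for ch in range(n_ch)
        )
        if chi2 <= chi2_limit:
            kept.append(k)
    if not kept:
        raise ValueError("every cluster was rejected; raise chi2_limit")
    total = sum(clusters[k][0] for k in kept)
    means = [
        sum(clusters[k][0] * clusters[k][1][ch] for k in kept) / total
        for ch in range(n_ch)
    ]
    stds = [
        math.sqrt(
            sum(
                clusters[k][0]
                * (clusters[k][2][ch] ** 2 + (clusters[k][1][ch] - means[ch]) ** 2)
                for k in kept
            )
            / total
        )
        for ch in range(n_ch)
    ]
    return kept, means, stds


# Hypothetical two-channel clusters: (pixel count, means, standard deviations).
clusters = [
    (100, [10.0, 20.0], [2.0, 2.0]),
    (50, [11.0, 21.0], [2.0, 2.0]),
    (10, [30.0, 5.0], [2.0, 2.0]),  # far from the other two
]
# 9.21 is roughly the 99% point of chi-square with two degrees of freedom.
kept, means, stds = combine_signatures(clusters, chi2_limit=9.21)
print(kept, [round(m, 2) for m in means], [round(s, 2) for s in stds])
```

In this example the third cluster lies far from the other two and is rejected, so the combined signature is built from the first two clusters only. 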

The subset of seven bands chosen for processing this set of data is 
as follows (listed in order of increasing wavelength): 2 (.46-.49 μm); 
3 (.48-.52 μm); 7 (.58-.64 μm); 9 (.67-.94 μm); 10 (1.0-1.4 μm); 
11 (1.5-1.8 μm); and 12 (9.3-11.7 μm). 

The data were then classified and evaluated. It was found that the 
overall classification rates were only fair and there was a major problem 
with tree false alarms in corn fields and also corn false alarms in tree 
areas. Further tests showed that these problems were not a result of having 
combined the individual clusters; in fact, classifying with the separate 
corn and tree clusters produced slightly poorer results. 

As a final investigation, "classical" training techniques, that is, 
calculation of a set of training statistics for each individual field using 
all the pixels in that field, were used for all corn fields, soybean fields, 
and woodlots. The set of signatures for each class was combined, after 
first omitting "outlying" signatures (signatures whose mean was further 
than a specified distance from the mean of the combined signatures). It 
was found that signatures thus discarded were from anomalous fields: so-called 
tree areas which were pasture with some trees, a soybean field that was very 
weedy or uneven in ground cover, etc. The final set of eight signatures 
therefore included five signatures derived from clustering and three 
signatures derived from the more "classical" approach. 

Results using this set showed a marked increase in correct classification 
and are presented in Table 3.9. It was noted that the tree-corn 
confusion problem, though still evident, involved significantly fewer 
pixels than previously. In comparing the recognition map to aerial 
photography, many cases were noted where apparently incorrect classifications 
in a corn or soybean field, for example, were matched with spots of dead 
crops or weeds in the fields. Thus we arrive at a problem in trying to 
assess classification results using only classification counts: 
nonhomogeneous recognition in a nonhomogeneous area thought to be homogeneous 
is likely to be correct classification. Therefore, it is believed that 
the numbers displayed are an understatement of the correct classification 
rate. Also pertaining to the interpretation of Table 3.9 is the observation 
that more anomalous ground covers, such as weeds or pasture, would be 
correctly classified if called any one of a number of training classes 
such as weeds, sparse vegetation or hay. 

Because of time limitations we were not able to classify the entire 
data set to compare with the S-192 data set. 

3.3 COMPARISON OF CLASSIFICATION RESULTS 

Examination of the S-192 and M-7 classification results, Tables 3.5, 
3.6 and 3.9, shows that the M-7 classification was substantially better 
than that accomplished using the S-192 data, especially as regards tree 
recognition. This is not too surprising considering the problems caused 
in the S-192 data by coarser resolution cell size and atmospheric effects 
due to the longer path length for reflected radiation to reach the SKYLAB 
sensor. Also, five of the seven bands used in processing the M-7 data 
were not usable in the S-192 data set due either to excessive noise or 
to limited dynamic range for the data in those bands. Thus it is 
perhaps unfair to compare results obtained from the two sensors. 

TABLE 3.9 PERFORMANCE MATRIX FOR M-7 MULTISPECTRAL SCANNER CLASSIFICATION 
          OF TRAINING AREA FOR FIELD CENTER PIXELS 

                                                                                          LIGHT     DARK 
GROUND TRUTH          #        #              SOY-                            SPARSE      BARE     (WET) 
CLASS            FIELDS   PIXELS     CORN    BEANS    TREES      HAY    GRASS     VEG.     SOIL     SOIL   UNCLASSIFIED 

Corn                 10     5767     85.0      0.9      6.4      4.3      3.2      0.5      0.0      0.0        0.0 
Soybeans              6     2248      4.5     77.0      3.2      2.9     12.4      0.0      0.0      0.0        0.0 
Trees                 8     2139      1.6      0.3     88.9      4.5      1.8      0.4      2.2      0.3        0.0 
Hay                   7     3379      2.5      5.2      4.8     86.9      0.5      0.1      0.0      0.0        0.0 
Weeds                 6     4371     10.3      0.3      5.9     22.7     22.0      5.1     25.5      2.8        0.0 
Pasture               5     1524      8.7      0.0      0.5      1.3     79.7      8.6      1.0      0.1        0.0 
Pasture/Woods         8      820      9.6      2.4     38.4     14.5     31.2      3.3      0.5      0.0        0.0 
Alfalfa               1      119      0.0     72.3      0.0     27.7      0.0      0.0      0.0      0.0        0.0 
Grass                 2      394      5.3      0.8      6.3      0.8     79.4      2.5      4.3      0.5        0.0 
Bare Soil             7     1741      1.1      0.1      0.8      0.0      5.0     11.1     24.2     57.6        0.0 
Field Beans           2      371     10.2      0.0      3.8      1.1     14.6     66.3      1.1                 0.0 

TOTALS               62    22873     25.5      9.1     13.6     [--------- 39.6 ---------]  [---- 12.2 ----]     0.0 
GROUND TRUTH (%)                     25.2      9.8      9.4     [--------- 48.0 ---------]  [----  7.6 ----]     0.0 

Overall Correct Classification = 84.1% 

RMS Error in Proportion Estimation = 4.7% 

However, an illustration which would point up the differences in the 
data collected, and hence the differences in the performance of the scanners, 
could be effected via a comparison of the manner in which signals from the 
various ground classes fill up the signal space. This can be shown by 
using a sequence of two-dimensional ellipse plots. For simplicity, 
presented here are two such plots for each scanner. In each plot, Figures 
3.1 and 3.2 for the S-192 data and Figures 3.3 and 3.4 for the M-7 data, 
the channels displayed are the best bands for discrimination. What is 
plotted is the two-dimensional contour ellipse for a chi-square value of 
one. The scale of the plots differs for the two scanners; the S-192 
plots are twice the scale of the M-7 plots. 

It is seen from comparing these figures that the S-192 data overlap 
a considerable amount; that is, the signals are compressed into a small 
portion of the available signal space. The ellipses shown are for a 
chi-square value of one, meaning that only about 40% of the population of a 
two-channel distribution lies inside the ellipse as drawn (assuming the 
distributions to be Gaussian). It is readily apparent, then, that the 60% 
of the pixels outside the ellipse of the correct class will lie inside the 
ellipse of some other, probably incorrect class. It is surprising that the 
processing results were as good as they were. For the M-7 data, noting the 
change in the scale of the plots, it is seen that the ellipses are spread 
about a larger area of the signal space and are also somewhat distant from 
each other. The closeness of the tree and corn distributions in each case 
indicates why this pair of classes was so troublesome. 
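
The figure of about 40% follows from a standard property of the two-channel Gaussian distribution: the squared distance from the mean, measured in units of the covariance, follows a chi-square distribution with two degrees of freedom, so the fraction of the population inside the contour for a chi-square value c is 1 - exp(-c/2). A minimal check (the function name is illustrative only): 

```python
import math


def fraction_inside_ellipse(chi_square):
    """Fraction of a two-channel Gaussian population inside the contour
    ellipse drawn for the given chi-square value."""
    return 1.0 - math.exp(-chi_square / 2.0)


print(round(fraction_inside_ellipse(1.0), 3))  # 0.393
```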

In conclusion, it has been shown that the limited range of the data or, 
viewing it another way, the compression of the signals into a small portion 
of the signal space, is responsible for the high confusion rate in classifying 
the S-192 data. 

4 

EFFECTS OF CHANNEL-TO-CHANNEL SPATIAL MISREGISTRATION ON 
RECOGNITION ACCURACY OF SKYLAB S-192 DATA 


4.1 THE PROBLEM 

The fact that Skylab S-192 data are spatially misregistered has been 
established. Scan-line-straightened data in particular are more severely 
misregistered than conic data, as is described in Section 2.1.3. A significant 
issue to be examined here is whether this misregistration is a cause for 
concern with regard to the recognition accuracy achievable using these data. 

To address this problem two techniques were employed. The effects of channel- 
to-channel misregistration were examined analytically and through a simulation 
technique. Two experiments were designed to implement the simulation technique. 
One experiment concentrated on the effects of misregistration on field center 
pixels and a second experiment investigated the effects on border or mixture 
pixels. Though it was found that misregistration has an insignificant effect 
on the recognition accuracy of field center pixels, it was determined that the 
availability of these pixels was markedly reduced. That is, with the introduction 
of misregistration, fewer pure field center pixels exist. As a result, the 
classification of mixture pixels (pixels whose signals were derived from two or 
more ground covers) was an important concern. It was determined that the correct 
classification of mixture pixels deteriorated with the introduction of 
misregistration. Misregistration could adversely influence the false alarm rate of 
ground classes, which in turn degrades the accuracy of standard proportion 
estimation techniques. 

4.2 THE APPROACH 

In order to facilitate the analysis of the effects of channel-to-channel 
spatial misregistration, S-192 resolution elements may be divided into four 
categories as illustrated in Figure 4.1. These are: (a) pure field center 
pixels can be misregistered but remain field center pixels; (b) pure field 
center pixels can be misregistered so those channel(s) out of registration 
become mixtures of two or more crop types; (c) mixture pixels can be misregistered 
so those channel(s) out of registration represent different mixture proportions; 
and (d) mixture pixels can be misregistered so those channel(s) out of registration 
become pure field center values. 

In the analysis of the effects of channel-to-channel spatial misregistration, 
pixels falling into category (a) were examined separately from those in (b), 
(c) and (d). Two techniques were employed in the analysis of the effects upon 
pure field center pixels that are misregistered but remain field center in all 
channels (category (a)). The first, an analytical technique, examined a simplified 
data structure, studying the effects of misregistration within a context of two 
signatures with a common covariance. The second technique employed was one based 
on the simulation of the effects of misregistration. A simulation was also carried 
out in the analysis of the effects on pixels of the above mentioned categories 
(b), (c) and (d). 

A simulation technique was decided upon in order to quantify in some 
manner the effects of misregistration on a given S-192 data set. Given a 
signature set from registered data, the problem was to determine in what 
manner the signatures would be affected by the introduction of a known degree 
of misregistration. Signatures were to be simulated representing not only 
pure field center statistics of misregistered data, but also border pixel 
statistics. Signatures were manipulated rather than the actual data in order 
to simplify the amount of processing required. 

A subset of five signatures from the agricultural processing set used in 
Section 3 was used as the basis for the simulation. These signatures 
represented the ground covers corn, tree, grass, bare soil and brush. The same 
seven bands of data previously selected were used here; these were SDOs 2, 8, 
10, 12, 17, 19 and 20. It was assumed for purposes of simulation that the data 
from which the signatures were generated were perfectly registered from 
channel to channel. From these initial signatures many signatures were 
generated representing a variety of distributions as affected by varying degrees 
of misregistration. When more than one channel was misregistered in simulation, 
each was misregistered from the original set by the same degree. Two different 
sets of processing were carried out. One examined the effects of misregistration 
of three S-192 channels and the other was a simulation of the misregistration 
of a single S-192 channel. 

Once a variety of misregistered distributions were simulated, several 
analyses were carried out. These were (1) an analysis of the effects of 
misregistration on the expected recognition performance matrix of misregistered 
field center pixels, (2) an analysis of the expected classification performance 
for mixtures of two crops at varying degrees of misregistration, and (3) an 
examination of the effect of misregistration on the availability of field 
center pixels. 

Presentation of the above analyses will first concern the effects of 
misregistration on field center pixels that remain field center in all channels 
even after misregistration and secondly the analysis of the effects of spatial 
misregistration on border and near border pixels will be discussed. 

4.3 THE EFFECT OF MISREGISTRATION ON RECOGNITION ACCURACY OF FIELD CENTER 

PIXELS THAT REMAIN FIELD CENTER IN ALL CHANNELS EVEN AFTER MISREGISTRATION 

The analysis of this section deals with an examination solely of field 
center pixels that remain field center in all bands even after misregistration. 

4.3.1 RESULTS OF THE ANALYTICAL ANALYSIS OF THE EFFECTS OF 
MISREGISTRATION ON FIELD CENTER PIXELS 

Insight was gained into what effects spatial misregistration may have on 
field-center recognition performance first through an analytical study of 
the problem. This analysis examined two normal distributions with common 
covariance for any number of channels of data. The conclusions of the analysis 
were intriguing. Where 'common sense' might dictate the hypothesis that 
misregistration would hurt field-center recognition performance, the model 
studied indicated that quite the opposite could be true. Under certain 
circumstances misregistration could actually improve results in the classification 
of field center pixels. 

Since misregistration and correlation are highly related, the analysis 
examined the error rate of classification as a function of correlation (ρ). It 
was determined that a unique maximum error rate is reached somewhere in 
-1 < ρ < 1. Figure 4.2 plots error rate as a function of correlation ρ in 
a conceivable manner as determined by the analysis. Misregistering data will 
cause correlation to tend to zero. Therefore, should the given correlation A 
between the two stated distributions lie in the range 0 < A < 1 1 for 

perfectly registered data, then by misregistering the data the expected error 
rate would actually decrease in value. A full presentation of the analytical 
analysis is given in Appendix IX. 
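
A small numerical illustration of this behavior, for the simplest case of two channels, is given below. It is a sketch under stated assumptions and not the derivation of Appendix IX: two equally likely normal classes, unit variance in each channel, a common correlation ρ, and hypothetical mean separations of 1 and 2 standard deviations in the two channels. Under those assumptions the error rate of the optimum classifier is Φ(-Δ/2), where Δ is the Mahalanobis distance between the class means. 

```python
import math


def error_rate(d1, d2, rho):
    """Error rate for two equally likely bivariate normal classes that share
    unit variances and correlation rho; d1 and d2 are the mean separations
    in the two channels (in standard deviations)."""
    delta_sq = (d1 * d1 - 2.0 * rho * d1 * d2 + d2 * d2) / (1.0 - rho * rho)
    # Phi(-delta / 2), written with the complementary error function.
    return 0.5 * math.erfc(math.sqrt(delta_sq) / (2.0 * math.sqrt(2.0)))


d1, d2 = 1.0, 2.0  # hypothetical separations
for rho in (-0.5, 0.0, 0.25, 0.5, 0.75, 0.9):
    print(f"rho = {rho:+.2f}   error rate = {error_rate(d1, d2, rho):.4f}")
```

For these separations the error rate peaks at ρ = 0.5. Data registered at ρ = 0.5 would therefore show a lower error rate if decorrelated toward zero, whereas data registered at ρ = 0.9 would show a higher one, which is in keeping with the statement that improvement occurs only under certain circumstances. 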

FIGURE 4.2 ERROR RATE OF RECOGNITION AS A FUNCTION 
OF CORRELATION ρ IN FIELD CENTERS 


In order to test the hypothesis of the analytical analysis in a more realistic 
data processing situation where there are more than two signatures, each with 
a distinct covariance matrix, a simulation model was developed to empirically 
analyze the effects of channel-to-channel spatial misregistration on the 
correct classification of field center S-192 resolution elements. 

4.3.2 THE FIELD CENTER RESOLUTION ELEMENT MISREGISTRATION MODEL 

The simulation model presented in this section describes the effect of 
misregistration of field center pixels that remain field center pixels in all 
channels even after misregistration. The means and standard deviations of 
pure field center pixels are not affected by misregistration. Hence the 
model does not modify these statistics. Correlations are the statistics 
that are affected. 

Analyses were made by Horwitz [8] and Coberly [9] of the correlation 
between ground elements studied as a function of the distance between the ground 
elements. Though both were studies of aircraft data, conclusions were drawn 
for LANDSAT size resolution elements. They determined that the correlation 
between LANDSAT size pixels drops exponentially as the distance between 
the pixels increases. In effect, two adjacent LANDSAT or S-192-sized pixels 
are virtually uncorrelated. 

In effect a misregistered scanner channel is measuring a signal displaced 
from the center of focus of the registered channels. Hence the correlation 
between two channels which are not registered would be less than the corresponding 
correlation had both channels been registered. The above mentioned analyses 
indicate that pure field center signatures derived from misregistered data 
are less correlated in those channels out of registration than field center 
signatures derived from corresponding registered data. 

The model chosen to simulate this effect is one that estimates the
decorrelation as a linear function of misregistration. This estimate is a
more conservative measure of the effect than the previously mentioned
exponential drop measured in aircraft data. However, since S-192 resolution
is not as fine as the aircraft resolution considered, this more conservative
estimate was deemed more appropriate.

Consider a perfectly registered distribution with mean $A_1$ and covariance $C_1$.
The corresponding distribution with some channel or channels misregistered has the
same mean vector $A_1$ but a different covariance $C_2$. Any off-diagonal term of
$C_2$, say $c_{2,ij}$, is related to the corresponding term $c_{1,ij}$ of $C_1$ in
the following manner:

$$ c_{2,ij} = (1 - \beta)\, c_{1,ij} $$

β was simply chosen to equal the degree of misregistration between the two
channels. For example, if two channels i and j were misregistered by one-half
pixel with respect to one another, then the correlation between i and j,
$c_{2,ij}$, was simulated to be one-half the measured correlation between i and j
in the registered signatures.
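
The linear decorrelation rule described above can be sketched in Python as follows. This is a minimal illustration, not the program used in the study: the function name `decorrelate`, the pairwise matrix `beta` and the example numbers are assumptions introduced here, and the (1 - β) scaling is inferred from the one-half pixel example in the text. Because the variances are left unchanged, scaling a covariance term by (1 - β) scales the corresponding correlation by the same factor.

```python
from typing import List

Matrix = List[List[float]]


def decorrelate(cov: Matrix, beta: Matrix) -> Matrix:
    """Return a copy of cov with each off-diagonal term scaled by (1 - beta[i][j]).

    beta[i][j] is the assumed misregistration (0 to 1 pixel) between
    channels i and j. Diagonal (variance) terms are left unchanged.
    """
    n = len(cov)
    out = [row[:] for row in cov]
    for i in range(n):
        for j in range(n):
            if i != j:
                out[i][j] = (1.0 - beta[i][j]) * cov[i][j]
    return out


# Hypothetical two-channel signature; the values are illustrative only.
registered = [[4.0, 2.0], [2.0, 9.0]]
half_pixel = [[0.0, 0.5], [0.5, 0.0]]
simulated = decorrelate(registered, half_pixel)
```

With a one-half pixel misregistration the off-diagonal term is halved, and with a whole pixel it goes to zero, while the means and variances of the signature are untouched.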

4.3.3 THE EXPERIMENTAL DESIGN

Appendix VII describes the experiment carried out in full. For purposes 
of clarity the following experiment summary is presented. 

Five S-192 field center signatures representing the distributions of
tree, corn, grass, bare soil and brush classes were selected for use in the
implementation of the experiment. Using the simulation model discussed in
the previous section, signatures representing field center distributions
misregistered by 1/3, 1/2, 2/3 and 1 whole pixel in SDOs 2,
12 and 17 were simulated (see Appendix I for wavelengths). These three SDOs
were chosen because they were found to be the three best channels for purposes
of discrimination for the given signature set. Thus we are calculating an upper
bound to the errors caused by misregistration. An expected performance matrix
was calculated for each of the four sets of simulated signatures, along with
the original signature set, using the program PEC described in Appendix XI.

The same processing was carried out using the best channel for discrimination, 
SDO 12, as the only misregistered channel. This was an attempt to measure the 
sensitivity of classification results as a function of the number of channels 
misregistered. 
4.3.4 RESULTS OF FIELD CENTER ANALYSIS

The analytical analysis described in Section 4.3.1 concluded that channel- 
to-channel spatial misregistration would not necessarily cause field center 
classification accuracy to deteriorate. However, it did not provide a 
measure of just how sensitive classification performance on field center 
pixels might be to misregistration. Therefore, the simulation technique 
was employed in an effort to quantitatively assess a classifier's sensitivity 
to misregistration. Keep in mind that both the analytical analysis and the 
empirical evidence gathered from aircraft data pertain only to those field 
center pixels that remain field center after misregistration. 

Tables 4.1 and 4.2 display results calculated for simulated misregistrations
of three channels and one channel, respectively. The row labelled "0 pixels"
represents the expected performance of the data set as is, without misregistration.
The results displayed in these tables seem to support the hypothesis that
misregistration need not be harmful to the recognition performance of field
center pixels that remain field center after misregistration. Note that, in
both Tables 4.1 and 4.2, the overall expected classification accuracy for the given
signature set diminishes slightly (by 0.22%) for misregistration of up to
one-half a pixel, but as more misregistration is introduced the performance
improves slightly (0.44 to 1.0%) above the beginning value.
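
The changes quoted above can be reproduced directly from the Overall columns of Tables 4.1 and 4.2. The short Python sketch below is only an arithmetic check; the dictionary and function names are introduced here for illustration and are not part of the original analysis.

```python
from typing import Dict

# Overall expected recognition accuracy (%) by degree of misregistration,
# copied from the Overall columns of Tables 4.1 and 4.2.
overall_three_channels: Dict[str, float] = {
    "0": 86.16, "1/3": 85.96, "1/2": 85.94, "2/3": 86.22, "1": 86.60,
}
overall_one_channel: Dict[str, float] = {
    "0": 86.16, "1/3": 85.92, "1/2": 85.94, "2/3": 86.26, "1": 87.16,
}


def change_from_registered(overall: Dict[str, float]) -> Dict[str, float]:
    """Difference (percentage points) between each case and the registered case."""
    base = overall["0"]
    return {k: round(v - base, 2) for k, v in overall.items() if k != "0"}


three = change_from_registered(overall_three_channels)
one = change_from_registered(overall_one_channel)
```

The one-half pixel entries give the 0.22 point loss, and the one-pixel entries give the 0.44 and 1.0 point gains for three channels and one channel respectively.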

Examination of the simulation results on a crop-by-crop basis leads
to further observations. First, not all the crops behaved in a like manner
as misregistration was introduced. In Table 4.2 bare soil retained a somewhat
constant expected performance, whereas grass experienced a loss of 0.2% at
β = 1/3 and then steadily improved from 81.1% to 84.2% at β = 1. Corn, on
the other hand, deteriorated up to β = 1/2 and then improved. The expected
recognition of trees deteriorated up to β = 2/3. Secondly, in comparing
Table 4.1 and Table 4.2 on a crop-by-crop basis, one detects, in most cases,
more sensitivity to the misregistration of one channel than to the misregistration
of three. Most pronounced is grass, which improved from 81.3 to 82.8 in Table 4.1,
and from 81.3 to 84.2 in Table 4.2. Interestingly, corn deteriorated in Table
4.2 up to β = 1/2, while it improved in Table 4.1.
TABLE 4.1. EXPECTED PERFORMANCE OF S-192 SIGNATURES
FOR VARYING DEGREES OF MISREGISTRATION OF
SDOs 2, 12 AND 17.

β, Degree of              Expected Recognition Accuracy (%)
Misregistration    Tree    Grass   Bare    Brush   Corn    Overall

0 pixels           96.5    81.3    97.9    77.2    77.9    86.16
1/3                96.3    80.3    98.2    76.8    78.2    85.96
1/2                96.1    81.1    98.1    76.0    78.4    85.94
2/3                96.2    81.8    98.1    76.0    79.0    86.22
1                  96.7    82.8    98.7    76.4    78.4    86.60


TABLE 4.2. EXPECTED PERFORMANCE OF S-192 SIGNATURES
FOR VARYING DEGREES OF MISREGISTRATION OF
SDO 12.

β, Degree of              Expected Recognition Accuracy (%)
Misregistration    Tree    Grass   Bare    Brush   Corn    Overall

0 pixels           96.5    81.3    97.9    77.2    77.9    86.16
1/3                95.6    81.1    97.7    78.5    76.7    85.92
1/2                95.2    81.5    97.7    79.0    76.4    85.94
2/3                95.2    82.0    97.6    79.7    76.8    86.26
1                  95.7    84.2    97.7    80.9    77.3    87.16
Three conclusions can be drawn from these results: (1) as was
hypothesized, misregistration is not necessarily harmful to the recognition
performance of field center pixels that remain field center in all channels
after misregistration; (2) though results may either decay or improve,
depending on the degree of misregistration, the expected performance of the
classifier was found here to vary only plus or minus one percent of the
total for registered data and at most three percent on a crop-by-crop basis;
and (3) the sensitivity of the classifier to misregistration did not appear
to be a function of the number of channels misregistered. In fact, more change
in classification was detected with only one channel misregistered than with three.

The fact that misregistration is expected in some cases to improve
recognition accuracy among field center pixels should not suggest using
misregistered data or actually misregistering data to improve recognition.
Though the recognition of certain field center pixels may actually improve,
evidence presented in the next section will indicate that other, more serious
problems are confronted with the introduction of misregistration. Deleterious
effects can be detected among border pixels and field center pixels that are
mixtures in the misregistered channels.

4.4 THE EFFECTS OF CHANNEL-TO-CHANNEL SPATIAL MISREGISTRATION ON 
BORDER AND NEAR BORDER PIXELS 

This section deals with the category of pixels consisting of field center 
pixels that become mixture pixels in those channels that are misregistered as 
well as border or mixture pixels. Within this overall category the most 
deleterious effects of misregistration are encountered. 

4.4.1 THE AVAILABILITY OF PURE FIELD CENTER PIXELS 
Channel-to-channel spatial misregistration reduces the availability
of pure field center pixels. Figure 4.3 displays several representations
of three-channel resolution elements. Pixel (a) is the appearance of a
pixel registered in all channels. It appears as a pure field center pixel
in cover type W. If this pixel were misregistered by one whole pixel in
channel 2 in the right-to-left direction it would appear as (b). The pixel
is still fully in cover W, but now channel 2 is detecting a signal displaced
an entire pixel from the registered location. Pixels (c), (d) and (e)
would all have been pure field center pixels had no misregistration been
introduced. These pixels are now mixtures in channel 2 of covers W and O.
In fact, pixel (e) is 100% cover O in channel 2, whereas it is 100% cover W
in channels 1 and 3. This effect of misregistration causes fewer pixels to
be pure field center in all channels.

TABLE 4.3. DISPLAY OF THE NUMBER OF PURE FIELD CENTER PIXELS
AVAILABLE FOR VARYING DEGREES OF MISREGISTRATION

                          NUMBER OF FIELD CENTER PIXELS AND PERCENT OF TOTAL FOR:
           TOTAL PIXELS   NO                 ONE-HALF PIXEL     ONE-PIXEL
           INCLUDING      MISREGISTRATION    MISREGISTRATION    MISREGISTRATION
           MIXTURES       No.      %         No.      %         No.      %

CORN        3641          1526     41.9      1054     28.9       537     14.7
BRUSH        820           341     41.6       227     27.7       117     14.3
TREE         490           175     35.7       105     21.4        41      8.4
GRASS       2922          1250     42.8       896     30.7       491     16.8
BARE         653           222     34.0       140     21.4        55      8.4
STUBBLE     1081           391     36.2       247     22.8       100      9.3
OTHER        706           296     41.9       209     29.6       119     16.9
TOTAL      10313          4201     40.7      2878     27.9      1460     14.2

Table 4.3 indicates, for the given S-192 data set, the availability of
pure field center pixels as a function of the degree of misregistration. For
a given misregistration β, any pure field center pixel within 2β of the
border lies within a sensitive region. The signals detected for these
pixels will be mixtures in the misregistered channels. Table 4.3 was
calculated using only the larger fields (greater than 17 acres) from the S-192
Southeast Michigan agricultural test site with a program designed to count field
center pixels given a set of polygon field designations. To determine how
many pure field center pixels would be available for a misregistration of β,
each field polygon was inset by β pixels and the available field center pixels
counted with respect to the new field designation.
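
The inset-and-count procedure can be illustrated with a deliberately simplified Python sketch. It is not the counting program used in the study: it assumes an axis-aligned rectangular field rather than a general polygon, a unit pixel grid, and that a pixel counts as field center only when it lies wholly inside the inset boundary. The function name and the example field are hypothetical.

```python
import math


def count_field_center_pixels(x0: float, x1: float,
                              y0: float, y1: float,
                              beta: float) -> int:
    """Count unit pixels lying wholly inside a rectangle inset by beta pixels.

    The rectangle spans [x0, x1] by [y0, y1] in pixel units, and pixel
    (col, row) covers [col, col + 1] by [row, row + 1].
    """
    left = math.ceil(x0 + beta)      # first column fully inside the inset field
    right = math.floor(x1 - beta)    # one past the last column fully inside
    bottom = math.ceil(y0 + beta)
    top = math.floor(y1 - beta)
    return max(0, right - left) * max(0, top - bottom)


# Hypothetical 10 x 10 pixel field aligned with the pixel grid.
registered_count = count_field_center_pixels(0, 10, 0, 10, beta=0.0)
one_pixel_count = count_field_center_pixels(0, 10, 0, 10, beta=1.0)
```

Even for this idealized field, a one-pixel inset removes over a third of the candidate pixels, which is the kind of loss Table 4.3 reports for real field polygons.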

It is evident from Table 4.3 that the availability of field center
pixels deteriorates rapidly with increased misregistration. Column 2
indicates the number of available field center pixels with perfect
registration. The third column indicates that with the introduction of 1/2 pixel
misregistration along the scan line the total number of field center pixels
diminishes by 1323 or 31.5%. Another 1/2 pixel misregistration reduces the
total number of available pixels by another 34% from those available initially.
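
The percentages quoted in this paragraph follow from the TOTAL row of Table 4.3. The sketch below repeats that arithmetic in Python; the variable names are introduced here, and the one-half pixel total of 2878 is taken as the registered total of 4201 less the stated loss of 1323.

```python
# Totals of pure field center pixels (TOTAL row of Table 4.3).
registered_total = 4201
half_pixel_total = 4201 - 1323   # 2878, from the loss stated in the text
one_pixel_total = 1460

# Loss caused by the first half pixel of misregistration.
loss_first_half = registered_total - half_pixel_total
pct_first_half = 100.0 * loss_first_half / registered_total

# Additional loss caused by the second half pixel, relative to the
# number of pixels available initially.
loss_second_half = half_pixel_total - one_pixel_total
pct_second_half = 100.0 * loss_second_half / registered_total
```

Both percentages are expressed against the registered total, which is why the second step is described as a reduction "from those available initially".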

The evidence of this analysis adds great weight to the need to study
the effects of misregistration on mixture pixels. Though misregistration
may have no significant effect on pure field center signatures, as concluded
in the previous section, the diminished availability of pure field center pixels
makes both the extraction of credible field center statistics more
difficult and the analysis of the effects of misregistration on mixture
pixels more significant.

4.4.2 THE SIMULATION MODEL DEVELOPED FOR BORDER AND NEAR BORDER PIXELS

S-192 resolution elements lying on field boundaries are mixtures of two
or more ground covers. Due to misregistration, certain channels of field
center resolution elements may also represent mixtures of two or more covers.
The concern was to develop a mathematical model that would enable an analyst
to describe any distribution from a misregistered data set arising from
mixtures of at most two crops, based on the signatures of the pure field
center crops. The model developed incorporates features of the ERIM mixtures
model.*

An n-channel multispectral signature for material W consists of a mean
vector $A_w$ with components $a_{wi}$, where i = 1,...,n, and a covariance matrix
$C_w$ with components $c_{wi,j}$ for each i = 1,...,n and j = 1,...,n.

Consider the case where the signal detected in one or more channels
represents a mixture of ground cover W and some other ground cover O. The
following is the model used to construct the signature of mixture pixels
from the pure signatures of W and O.

Let $\alpha_{wi}$ be the proportion of cover W present for each pixel and
$\alpha_{oi} = 1 - \alpha_{wi}$ the proportion of cover O present for each pixel.
If the pixel were of pure cover W then $\alpha_{wi} = 1$. A mean vector $A_m$ of a
mixture distribution of crops W and O consists of components:

$$ a_{mi} = \alpha_{wi}\, a_{wi} + \alpha_{oi}\, a_{oi} \tag{4.1} $$

where i denotes the spectral channel.

The definition of a term $c_{mi,j}$ of the variance-covariance matrix is:

$$ c_{mi,j} = \alpha_{wi}\, c_{wi,j} + (1 - \alpha_{wi})\, c_{oi,j} \tag{4.2} $$

Whenever i = j the channel variance term $\sigma_{mi}^2$ would be:

$$ \sigma_{mi}^2 = \alpha_{wi}\, \sigma_{wi}^2 + (1 - \alpha_{wi})\, \sigma_{oi}^2 \tag{4.3} $$

Given any two distributions, then, one can approximate mixture distributions
in any proportion of the two crops using Equations (4.1) and (4.2).
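
Equations (4.1) and (4.2) can be sketched in Python as follows. This is a minimal illustration for the registered case, in which a single proportion applies to every channel; the function name `mix_signature` and the two-channel example signatures are hypothetical and are not taken from the S-192 data.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def mix_signature(mean_w: Vector, cov_w: Matrix,
                  mean_o: Vector, cov_o: Matrix,
                  alpha_w: float) -> Tuple[Vector, Matrix]:
    """Mixture signature of covers W and O for registered data.

    alpha_w is the proportion of cover W; the proportion of cover O
    is 1 - alpha_w. Means follow Eq. (4.1), covariances Eq. (4.2).
    """
    alpha_o = 1.0 - alpha_w
    n = len(mean_w)
    mean_m = [alpha_w * mean_w[i] + alpha_o * mean_o[i] for i in range(n)]
    cov_m = [[alpha_w * cov_w[i][j] + alpha_o * cov_o[i][j] for j in range(n)]
             for i in range(n)]
    return mean_m, cov_m


# Hypothetical two-channel signatures, for illustration only.
mean_w, cov_w = [10.0, 20.0], [[4.0, 1.0], [1.0, 9.0]]
mean_o, cov_o = [30.0, 40.0], [[16.0, 2.0], [2.0, 25.0]]
mean_m, cov_m = mix_signature(mean_w, cov_w, mean_o, cov_o, alpha_w=0.5)
```

Setting `alpha_w` to 1 returns the pure W signature unchanged, and the diagonal of `cov_m` reproduces the variance expression of Eq. (4.3).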


*The misregistration-mixtures model discussed here was developed for
NASA/JSC under [17] and current contract NAS9-14123.
With the occurrence of spatial misregistration between channels, Eq. (4.2)
should not be used to estimate covariance between channels that are not in
registration with respect to one another. If the pixel in question is
a field center pixel of cover W lying near the border, or represents a mixture
of covers W and O, and if two channels, say i and j, are not in registration
with respect to one another, then the following model can be used to
approximate the distribution.

Let $\alpha_{wi}$ be the proportion of cover W present for each pixel in channel
i and $\alpha_{oi}$ the proportion of cover O present for each pixel in channel i.

For the misregistered mean vector $A_M$, use Equation (4.1). Then for the definition
of a term $c_{Mi,j}$ of the variance-covariance matrix, use:

$$ c_{Mi,j} = \min(\alpha_{wi}, \alpha_{wj})\, c_{wi,j} + \min(\alpha_{oi}, \alpha_{oj})\, c_{oi,j} \tag{4.4} $$

or, equivalently,

$$ c_{Mi,j} = \min(\alpha_{wi}, \alpha_{wj})\, c_{wi,j} + [1 - \max(\alpha_{wi}, \alpha_{wj})]\, c_{oi,j} \tag{4.5} $$

Whenever i = j, the variance term is given by

$$ c_{Mi,i} = \alpha_{wi}\, c_{wi,i} + (1 - \alpha_{wi})\, c_{oi,i} \tag{4.6} $$

Letting $\sigma_i^2$ represent the channel i variance, with appropriate subscripts
we have:

$$ \sigma_{Mi}^2 = \alpha_{wi}\, \sigma_{wi}^2 + \alpha_{oi}\, \sigma_{oi}^2 \tag{4.7} $$

This expression is equivalent to the mixture variance estimation model,
Eq. (4.3).
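
The step from the form with two minima, Eq. (4.4), to the form with a maximum, Eq. (4.5), is not spelled out in the text. A short derivation, assuming only that the two proportions in each channel sum to one ($\alpha_{oi} = 1 - \alpha_{wi}$), is:

$$ \min(\alpha_{oi}, \alpha_{oj}) = \min(1 - \alpha_{wi},\, 1 - \alpha_{wj}) = 1 - \max(\alpha_{wi}, \alpha_{wj}) $$

Setting i = j in Eq. (4.5) gives $\min(\alpha_{wi}, \alpha_{wi}) = \max(\alpha_{wi}, \alpha_{wi}) = \alpha_{wi}$, so that

$$ c_{Mi,i} = \alpha_{wi}\, c_{wi,i} + (1 - \alpha_{wi})\, c_{oi,i} $$

which is the variance term of Eq. (4.6).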
Equation (4.5) describes in full the estimated covariance between any two
channels of data that are being simulated under the stated model. Diagonal
terms of the variance-covariance matrix (the channel variances) are described
by Eq. (4.7). Let us here consider the correlation terms between channels
in an attempt to more fully describe and justify the underlying assumptions
made in arriving at this simulation model.


Figure 4.4. Example of Channel Misregistration for a Single Resolution Element.
(a) Perfectly Registered (Other | Wheat): $\alpha_{wi} = 1$ for all i.
(b) Misregistered (Other | Wheat): $\alpha_{w1} = 1$, $\alpha_{w2} = 1$, $\alpha_{w3} = 1/2$,
$\alpha_{w4} = 1$, $\alpha_{w5} = 1/2$, $\alpha_{w6} = 5/6$.

Figure 4.4 displays a possible configuration of the composite signal
received by six different channels while focusing on a single resolution
element. Figure 4.4(a) indicates that all six channels are focused on
precisely the same location, a borderline resolution element of wheat. This
indicates a perfectly registered vector of signals. Figure 4.4(b) indicates
a vector wherein channels 3, 5, and 6 are misregistered and actually
viewing mixtures of wheat and other.
Correlation terms between channels 1, 2 and 4 remain identical
in Figure 4.4(b) to their calculated values for the case shown in Figure 4.4(a).
It is also easy to see that the cross correlation between channels 3 and 5 is
identical to the mixture covariance estimation model: whenever
$\alpha_{wi} = \alpha_{wj}$, Eq. (4.5) becomes:

$$ c_{Mi,j} = \alpha_{wi}\, c_{wi,j} + \alpha_{oi}\, c_{oi,j} $$

which is ERIM's mixture model [15].

However, whenever $\alpha_{wi} \neq \alpha_{wj}$, as is the case, for example, in channel 1
versus 3 or 3 vs. 6, Eq. (4.5) addresses situations not previously considered
by the mixture model, and the assumptions made in the evaluation of these
covariance terms must be fully understood.
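
The weights that Eq. (4.5) assigns to the wheat and other covariance terms can be worked out for the channel pairs mentioned above. The Python sketch below is illustrative only: the function names are introduced here, the wheat proportions are those listed for Figure 4.4(b), and exact fractions are used so that the weights can be compared without rounding.

```python
from fractions import Fraction
from typing import List, Sequence, Tuple


def pair_weights(alpha_w: Sequence[Fraction], i: int, j: int) -> Tuple[Fraction, Fraction]:
    """Weights applied to the W and O covariance terms for channels i and j, Eq. (4.5)."""
    w_weight = min(alpha_w[i], alpha_w[j])
    o_weight = 1 - max(alpha_w[i], alpha_w[j])
    return w_weight, o_weight


def misregistered_cov(cov_w: List[list], cov_o: List[list],
                      alpha_w: Sequence[Fraction]) -> List[list]:
    """Full covariance matrix under Eq. (4.5); the diagonal reduces to Eq. (4.6)."""
    n = len(alpha_w)
    out = []
    for i in range(n):
        row = []
        for j in range(n):
            w_weight, o_weight = pair_weights(alpha_w, i, j)
            row.append(w_weight * cov_w[i][j] + o_weight * cov_o[i][j])
        out.append(row)
    return out


# Wheat proportions per channel for Figure 4.4(b); index 0 is channel 1.
alpha_fig_4_4b = [Fraction(1), Fraction(1), Fraction(1, 2),
                  Fraction(1), Fraction(1, 2), Fraction(5, 6)]

ch1_vs_ch3 = pair_weights(alpha_fig_4_4b, 0, 2)
ch3_vs_ch5 = pair_weights(alpha_fig_4_4b, 2, 4)
ch3_vs_ch6 = pair_weights(alpha_fig_4_4b, 2, 5)
```

For channels 3 and 5 the two weights sum to one, as in the ordinary mixture model. For channel 1 versus 3 and channel 3 versus 6 they sum to less than one, which reflects the assumption that the non-overlapping part of the two fields of view contributes nothing to the covariance.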


FIGURE 4.5 A MISREGISTRATION CONFIGURATION IN TWO CHANNELS
FOR A SINGLE RESOLUTION ELEMENT
(a) With Overlap. (b) Totally Misregistered (No Overlap).
Eq. (4.5) is:

$$ c_{Mi,j} = \min(\alpha_{wi}, \alpha_{wj})\, c_{wi,j} + \min(\alpha_{oi}, \alpha_{oj})\, c_{oi,j} $$

Figure 4.5(a) displays what the components of Eq. (4.5) are estimating.
Note that $\alpha_{wj} = \min(\alpha_{wi}, \alpha_{wj})$ gives the proportion of overlap
(area shaded) between the two channels in the wheat field. Hence
$\alpha_{wj}\, c_{wi,j}$ is the contribution of $c_{wi,j}$ to the constructed covariance
term $c_{Mi,j}$. Similarly, $\alpha_{oi} = \min(\alpha_{oi}, \alpha_{oj})$ is the proportion
of the other field that is common to both channels i and j (area shaded), and
$\alpha_{oi}\, c_{oi,j}$ is the contribution of the covariance of 'other' in channels
i and j. Hence where there is no overlap, the cross correlation is assumed to be
negligible and therefore contributes nothing to the estimate.

The two basic assumptions made in the derivation of the covariance
estimation model are (1) within the same field the correlation between two
ground signals drops off rapidly as the distance between the signals increases
and (2) signals from different crops are totally uncorrelated. Figure 4.5(b)
illustrates the second assumption. Here the correlation $c_{Mi,j} = 0$. Also,
as seen in Figure 4.5, the contribution to the estimated correlation from the
unshaded area is assumed zero. The only contribution is from the shaded area.

FIGURE 4.6 ILLUSTRATION OF COVARIANCE AS ESTIMATED
AND TRUE COVARIANCE
(Horizontal axis: misregistration (1 - α). Curves: estimated covariance and true covariance.)
Figure 4.6 illustrates a comparison between the covariance estimated by
the proposed model and a hypothetical true covariance. The difference between
the two curves is due to both assumption (1) above and the fact that scanner
noise and atmospheric noise contributions were not considered. When there is
no misregistration, ρ(1 - α) = ρ(0) and the estimate is exact. As misregistration
increases some error is introduced. The analytical derivation of Eq. (4.5), based
on the assumptions mentioned above, is presented in Appendix X.
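
As a worked special case (an illustration under assumed conditions, not a result taken from Appendix X), suppose channel i views only cover W, so that $\alpha_{wi} = 1$, while channel j is displaced so that $\alpha_{wj} = \alpha \le 1$. Eq. (4.5) then reduces to:

$$ c_{Mi,j} = \min(1, \alpha)\, c_{wi,j} + [1 - \max(1, \alpha)]\, c_{oi,j} = \alpha\, c_{wi,j} $$

Under this assumption the estimated covariance is a straight line in the misregistration $(1 - \alpha)$: it equals the registered value $c_{wi,j}$ at $(1 - \alpha) = 0$ and falls to zero at $(1 - \alpha) = 1$.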

4.4.3 THE EXPERIMENT FOR MIXTURE PIXELS 

Appendix VIII describes the experiment carried out in full. For purposes
of clarity, the following experiment summary is presented.

In the analysis of the effects of channel-to-channel misregistration on
mixture pixels, two types of signature simulations were required. First,
signatures representing field center distributions misregistered by
1/2 and 1 whole pixel in SDOs 2, 12, and 17 were calculated. Another
experiment was run in parallel with only one channel, SDO 12, misregistered.
The second experiment is otherwise identical to the first, and analysis of
the results of both is presented in the next sections. Once field center signatures
were calculated, new distributions representing mixtures of all permutations of
two ground covers for varying proportions were simulated as follows. Let
$\alpha_{iA}$ and $\alpha_{iB}$ be the proportions of distributions A and B in the i-th
channel used to simulate a mixture of ground covers A and B. For perfectly registered
signatures, $\alpha_{iA}$ was set to 2/3, 1/3 and 0 for every channel i. However, for
misregistered signatures, the channels out of registration would be in
different proportions. For example, if a signature was misregistered by 1/2
a pixel, the proportion of cover type A would be $\alpha_{iA} - 1/2$. Hence any field
center pixels in the registered case within 1/2 pixel of the boundary would
become mixture pixels in the misregistered case. (In effect there would be
fewer field center pixels.) Therefore signatures representing mixtures of
misregistered distributions were simulated with proportions of $\alpha_{iA}$ and
$\alpha_{iB}$ in the registered channels i and $(\alpha_{jA} - \beta)$ and
$(\alpha_{jB} + \beta)$ in the misregistered channels j, where β is the degree of
misregistration.
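
The assignment of proportions to registered and misregistered channels can be sketched in Python as follows. This is a minimal illustration: the function name, the channel indexing, and in particular the clamping of proportions to the range 0 to 1 are assumptions made here and are not stated in the text.

```python
from typing import List, Set, Tuple


def channel_proportions(alpha_a: float, beta: float,
                        misregistered: Set[int],
                        n_channels: int) -> List[Tuple[float, float]]:
    """Per-channel proportions (cover A, cover B) for one simulated pixel.

    Registered channels keep (alpha_a, 1 - alpha_a). Misregistered
    channels use (alpha_a - beta, 1 - alpha_a + beta).
    """
    props = []
    for ch in range(n_channels):
        a = alpha_a - beta if ch in misregistered else alpha_a
        a = min(1.0, max(0.0, a))  # assumption: clamp to a valid proportion
        props.append((a, 1.0 - a))
    return props


# A registered field center pixel of cover A (alpha_a = 1) with the middle
# of three hypothetical channels misregistered by one-half pixel.
example = channel_proportions(1.0, 0.5, misregistered={1}, n_channels=3)
```

In this example the pixel remains pure cover A in the two registered channels but becomes a half-and-half mixture in the misregistered one, which is the situation described for field center pixels within 1/2 pixel of the boundary.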
Once the simulated signatures were obtained, the program PEC (see
Appendix XI), with a 0.001 probability of falsely rejecting a pixel from a
multivariate Gaussian distribution, was used to calculate the expected
performance for each set of signatures representing a given misregistration case.
That is, given the linear decision boundaries between the 5 field center
signatures, what will be the expected classification of mixture pixels?

Analysis consisted of the study of the expected performance curve as a
function of the location of the pixel across a field boundary. The study
centered on the analysis of four basic problems: (1) the effect
of misregistration on the classification of a mixture pixel of two ground
covers; (2) the effect of misregistration on the false alarm rate of any given
crop among mixtures of two other ground covers; (3) the effect of misregistration
on proportion estimation; and (4) effects as a function of the number of channels
misregistered. These analyses are presented in the following sections.

4.4.4 INTERPRETATION OF RESULTS 

In order to facilitate the discussion of the results it would be wise at 
this point to introduce the standard format of the graphs to be presented. 

These graphs were vital tools in most of the analyses carried out and it 
would be of invaluable aid to be fully at ease with their format. 

Each figure is composed of three graphs (see Figure 4.7 as an example) 
with each graph displaying one of the degrees of misregistration considered 
(0, 1/2, or 1 pixel). The curves display the expected performance of pixels 
of the types labelled at the top of the graphs, as a function of the proportion 
present of each of the two possible crop types. In a sense one could envision, 
as an aid in studying these graphs, a pixel moving across a fixed field boundary 
and at various locations the expected probability of that pixel’s classification 
would be calculated. Note that in each of the following graphs a zone representing
pure field center pixels in the registered case has been labelled, as well as
an area representing mixtures of varying degrees. The width of these zones is
exactly one pixel and the field boundary would appear as drawn. The right hand
corner of a pixel placed on this grid would lie at the labelled mixture proportion
that it represents.
Primarily the presentation will center on the effects noted on brush and
grass mixture pixels interacting with brush, grass, and corn signatures. Since
corn is the major crop of interest in the scene, the analysis of the false
alarm rate will revolve primarily about the false alarm rate of corn. These
crops were chosen for primary consideration since corn, grass, and brush
comprise almost three-fourths of the scene. The effects of misregistration
analyzed through the interaction of these crops are fairly typical of the entire
study; they represent neither one extreme nor the other. Some consideration will
also be given to interactions between other crops.

4.4.5 DISCUSSION OF THE EFFECTS OF CHANNEL-TO-CHANNEL SPATIAL 
MISREGISTRATION ON BRUSH-GRASS MIXTURES 

Figure 4.7 displays three graphs, one for each degree of misregistration
of the three SDOs considered, plotting the expected probability of classifying
brush and brush-grass mixtures as brush (the solid line) or grass (the dashed
line). In Figure 4.7(a), on top, one notes that in the area designated brush,
these field center pixels are for the most part classified as brush. As the
mixture of brush and grass becomes predominantly grass, the performance curve
increases for grass and decreases for brush. Also note in Figure 4.7(a) that
at the border (1/2, 1/2)* mixture pixels are in proportion one-half grass and
one-half brush and are called brush or grass 70% of the time. These pixels
are thus incorrectly classified 30% of the time. As misregistration is
introduced (compare Figures 4.7(a), (b) and (c)), field center brush pixels are not
classified as brush with as much consistency. The expected performance for those
pixels nearest the border deteriorates from around 78% to 42% correct for one-half
pixel misregistration, and down to 15% for one pixel misregistration. The
indication is that misregistration does significantly affect the correct
classification of near-border and border pixels.

Figure 4.9 is a counterpart to Figure 4.7. Here misregistration is depicted 
from grass into brush. Again near-border grass pixel classification 
deteriorates, from 83% to 25% correct classification with one pixel misregistration. 
Figure 4.8 displays the expected probability of classifying a brush or
brush-grass pixel as corn. Even in the registered case, the corn false alarms
among brush-grass pixels are significant. As misregistration is introduced,
more and more corn false alarms occur among pixels that were pure field center
brush pixels in the registered case. In fact those nearest the border are
called corn with up to 40% regularity. In view of this graph alone, one cannot
dismiss the significant increase in corn false alarms introduced by misregistration
of the data. Figure 4.10 acts as the counterpart for Figure 4.8 with misregistration
from grass into brush. One notices that as misregistration increases more corn
false alarms occur among otherwise pure grass pixels; however, the rate decreases
among grass-brush mixture pixels.

These observations indicate that misregistration has a significant effect
on the correct classification of mixture pixels. It was also evident in these
and other graphs that are not presented that the corn false alarm rate was high
among mixtures of different crops (see Section 4.4.7). Several observations were
also made in examining the effects of misregistration as a function of the channels
misregistered. Generally, the recognition curves did not deteriorate as rapidly
with only one channel misregistered. However, depending on the mixtures, some
curves would deteriorate even more rapidly, indicating a need for concern even
though just one channel was improperly registered. The next set of curves to be
presented, Figures 4.11 to 4.14, are the counterparts of Figures 4.7 through
4.10, respectively, for the simulated misregistration of only one channel.

Figure 4.11 is a display of misregistration of SDO 12 from brush to grass.
Near-border pixels deteriorate from about 80% to 35% at the extreme of one pixel
misregistration. This indicates a less rapid deterioration than in the case
of Figure 4.7 where three channels were misregistered.

Figure 4.12 is a display of the corn false alarm count for one channel
misregistered. In contrast to Figure 4.8 the rate is not nearly as pronounced,
and yet there is a marked increase in the false alarm count of corn among pixels
that would have been pure brush had the data been registered.
Figure 4.13 displays grass pixels misregistered into brush. Interestingly,
the deterioration of the expected probability is at about the same rate as
that indicated for three misregistered channels in Figure 4.9. It is
especially interesting to note that the curve for brush increases at a less
rapid rate for a one channel misregistration of one pixel than it does for the
three channel case. The reason for this becomes clear upon examination of
Figure 4.14, the display of the corn false alarm rate. Surprisingly, with just
one channel misregistered more corn false alarms are detected as misregistration
increases than with 3 channels of misregistration. This could be explained in
that SDO 12 best discriminates corn from grass-brush mixtures. Misregistration
of that one channel may tend to make the mixtures look more like corn in that
channel, whereas misregistration of three channels may make the mixture less
like corn in SDOs 2 and 17. As a result more corn false alarms are detected
among pixels misregistered in one SDO than in three. The conclusion to be drawn
from this observation is that the effects of misregistration should
not be overlooked even though just one channel is in question.

The graphs presented to this point describe not only the effects of 
misregistration on the predominant scene classes, but are also typical of 
the kinds of observations that can be made concerning other mixture combinations. 
A few more graphs will be presented in the next subsection for purposes of 
giving the reader a broader perspective on the analysis carried out. 

4.4.6 ADDITIONAL DISCUSSION OF THE EFFECTS OF CHANNEL-TO-CHANNEL
MISREGISTRATION ON BORDER PIXELS

The following graphs were chosen for discussion to display various
observations that were made concerning the effects of channel-to-channel
spatial misregistration on the classification of S-192 data. Corn, grass, and
brush have been previously discussed since they are the predominant scene classes.
The examples chosen here will either (1) display a mixture for which three
channels of misregistration cause much more deterioration of classification
accuracy than does one channel or (2) display an example wherein the
misregistration of one channel causes a greater rate of false alarm of grass
pixels than three channels.

Figure 4.15 is a display of the effect of three channels of misregistration
on the classification accuracy of bare soil-brush pixels. Registered pure
bare soil pixels that are nearest the border fall in expected recognition
accuracy from near 100% to 0% as misregistration is introduced. In Figure 4.15(c),
at the point (1,0), it is interesting to note that only a few percent of these
mixtures of misregistered bare soil-brush pixels are recognized as either bare
soil or brush. Figure 4.16 indicates that a good percentage of these pixels
would be misclassified as corn. Many others were called grass and a very high
percentage went unclassified at a 0.001 probability of false rejection.

For one channel of misregistration (Figures 4.17 and 4.18), the expected
performance of the bare soil-brush combination was not deleteriously affected.
This indicates that a great deal of separation from other ground covers was
maintained in spite of misregistration.

The next series of graphs (Figures 4.19 to 4.22) displays the recognition
curves of mixtures of corn and bare soil. These indicate a situation for which
a single misregistered channel produces a more harmful effect than three
misregistered channels. Figure 4.19 (3 channels), compared with Figure 4.21
(1 channel), reveals similar expected performance curves for corn mixtures.
However, examining Figure 4.19, as the mixtures become more like bare soil
the bare soil classification curve compensates by increasing more rapidly,
since the pixels are more like bare soil in the misregistered channels.
In Figure 4.21, however, bare soil retains about the same
classification rate regardless of the degree of misregistration of the one
channel. Comparing the curves of grass false alarms among corn-bare mixtures,
a remarkable increase in the false alarm rate for one channel misregistered
(Figure 4.22) is noted in comparison to three channels (Figure 4.20). At the
high point, one-half pixel misregistration of SDO 12 causes a 42% rate
among corn-bare mixtures. At one pixel misregistration, the figures are
58% versus 38%.
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MIXTURE PROPORTIONS (BRUSH, GRASS) FOR PERFECTLY REGISTERED PIXELS 

FIGURE 4.11. EXPECTED CLASSIFICATION PERFORMANCE OF BRUSH, 
BRUSH-GRASS MIXTURE PIXELS. ONE CHANNEL MISREGISTERED . 
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FIGURE 4.14. CORN FALSE ALARMS „^NQ GRASS AND GRASS-BRUSH 
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MIXTURE PROPORTIONS (BARE SOIL-BRUSH) FOR PERFECTLY REGISTERED PIXELS 
FIGURE 4.15 EXPECTED CLASSIFICATION PERFORMANCE OF BARE SOIL, BARE SOlL- 
BRDSH MIXTURE PIXELS. THREE CHANNELS MISREGISTERED . 
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FIGURE 4.16. CORN FALSE ALARMS AMONG BARE SOIL AND B 
MIXTURE PIXELS. THREE CHANNELS MISREGI 
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MIXTURE PROPORTIONS (CORN-BARE SOIL) FOR PERFECTLY REGISTERED PIXELS 

FIGURE 4.21 EXPECTED CLASSIFICATION PERFORMANCE OF CORN, CORN-BARE SOIL 
MIXTURE PIXELS. ONE CHANNEL MISREGISTERED. 
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4.4.7 EFFECT OF MISREGISTRATION ON STANDARD PROPORTION ESTIMATION 
A question of obvious concern is to what extent proportion estimation is
affected by channel-to-channel spatial misregistration. It is argued
generally that errors of one kind tend to compensate for errors of another
kind; that is, errors are made uniformly in all directions and over a large
sample their effects will be cancelled. The surprising corn false alarm rate
among registered pixels of brush-grass previously discussed already indicates
that the process of proportion estimation is less than an exact science.

The increased number of false alarms to be expected with the introduction of 
misregistration places even more reliance on compensating errors for accurate 
proportion estimation. 

Figure 4.23 is presented to show that the errors introduced are not
strictly compensatory for proportion estimation, especially when misregistration
is introduced in the scene. Let us focus our attention on the estimation of
the proportion of corn. Noting an increased rate of corn false alarms among
brush-grass pixels, these would necessarily have to be compensated for by a
decrease in the correct classification of corn or mixtures of corn-other pixels
(here we use the expression correct classification in the sense that mixtures of
two covers A and B are classified as either A or B). Figure 4.23 is a graph
of the expected probability of "correct" classification of two ground covers as
labelled as a function of the mixture proportion. The solid line indicates the
amount of brush-grass correctly classified. With more misregistration there
are more false alarms, particularly of corn, as previously noted. However, the
correct classification of corn, corn-grass or corn-brush pixels does not
correspondingly decrease, indicating that corn may be overestimated.
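
The compensation argument can be illustrated numerically. The following is a
minimal sketch in Python, assuming a two-class scene and classification rates
that are purely hypothetical (they are not read from Figure 4.23). The function
name and all numbers are illustrative assumptions. A standard proportion
estimate is taken here to be the expected fraction of pixels labelled as each
class.

```python
def estimated_proportions(true_props, confusion):
    """Expected fraction of pixels labelled as each class.

    true_props[i]   -- true scene fraction of class i
    confusion[i][j] -- probability that a class-i pixel is labelled j
    """
    n = len(true_props)
    return [sum(true_props[i] * confusion[i][j] for i in range(n))
            for j in range(n)]


# Hypothetical scene: index 0 = corn, index 1 = brush-grass.
true_props = [0.30, 0.70]

# Assumed rates for registered data: corn misses (0.30 * 0.14) happen to
# equal corn false alarms (0.70 * 0.06), so the errors compensate.
registered = [[0.86, 0.14],
              [0.06, 0.94]]

# Assumed rates for misregistered data: corn false alarms among brush-grass
# double, while the correct classification of corn does not decrease.
misregistered = [[0.86, 0.14],
                 [0.12, 0.88]]

est_reg = estimated_proportions(true_props, registered)
est_mis = estimated_proportions(true_props, misregistered)
print("corn estimate, registered:   ", round(est_reg[0], 3))
print("corn estimate, misregistered:", round(est_mis[0], 3))
```

In this sketch the corn estimate is unbiased only while misses and false alarms
balance; once the false alarm rate rises without a matching drop in correct
corn classification, the corn proportion is overestimated.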

4.5 CONCLUSIONS AND RECOMMENDATIONS 

Examination of the effects of spatial misregistration on S-192 scanner
signals centered upon an examination of expected classification performance
for certain degrees of misregistration. In the physical sense, data are
affected by misregistration in that the correlation between channels not in
registration with respect to one another decreases, and pure field center pixels
that are near borders of fields may become mixtures in those channels that are
misregistered. It is due to these physical effects on the scanner signals that
misregistration deleteriously affects recognition performance.
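
How a field center pixel near a border becomes a mixture in a misregistered
channel can be sketched as follows. This is a minimal one-dimensional
illustration in Python, assuming contiguous pixels and linear (area-weighted)
mixing of signals; the function name and the signal levels are hypothetical
and are not values from the simulation reported here.

```python
def shift_channel(line, d):
    """Signal of one channel displaced by d pixels (0 <= d <= 1) along a scan.

    Each displaced pixel covers (1 - d) of its own ground cell and d of the
    next one, so its signal is modelled as an area-weighted mix of the two.
    """
    out = []
    for k in range(len(line)):
        nxt = line[k + 1] if k + 1 < len(line) else line[k]
        out.append((1.0 - d) * line[k] + d * nxt)
    return out


# Hypothetical mean signal levels (counts) of two adjacent fields in one band.
CORN, SOIL = 40.0, 90.0
scan = [CORN] * 4 + [SOIL] * 4

half = shift_channel(scan, 0.5)   # one-half pixel misregistration
full = shift_channel(scan, 1.0)   # one pixel misregistration
print(half)
print(full)
```

Pixels well inside either field keep their pure signal in the displaced
channel, while the corn pixel nearest the border takes on a half-corn,
half-soil signal at one-half pixel displacement and a pure soil signal at one
pixel, even though it is still pure corn in the registered channels.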

The effect on the classification of field center pixels that remain field
center in all channels even after misregistration was found to be insignificant.
This conclusion seemed to be independent of the number of channels misregistered.

However, misregistration had serious effects on the correct classification
of border and near-border pixels. First it was determined that the availability
of pure field center signatures was affected in that fewer pixels are found to
be pure ground covers in all channels. This increases the number of pixels
that are mixtures of two or more ground covers in some or all bands. Analysis
of these mixture pixels led to the conclusions that (1) misregistration increases
the error rate in the classification of S-192 data and (2) misregistration increases
the false alarm rate. Increases in the false alarm rate of corn and grass were
particularly noted. In terms of standard proportion estimation, the availability
of fewer field center pixels, coupled with the increased rate of false alarms
among mixture pixels, greatly increases reliance on the compensation of errors for
accurate proportion estimation. The simulation provided evidence, in one case,
to indicate that errors were indeed not compensatory. The effect of misregistration
as a function of the number of channels misregistered was undetermined. In some
cases misregistration of three channels caused more serious effects than the
misregistration of one channel. However, instances were found to indicate the
opposite to be true as well.

Hence, misregistration affects the processing of S-192 or any coarse 
spatial resolution scanner data in a manner that is not to be taken lightly. 

Since S-192 conic format data has already been determined to be out of
registration to some degree, it would be difficult if not impossible to
precisely quantify the extent to which classification accuracy has
deteriorated due to the misregistration; however, it has been determined
both analytically and empirically, through a simulation of the effects
of misregistration, that the extent of the harm done could be significant.
As regards the processing of scan line straightened data, however, it has
been shown (Section 2.1.4) that the process of scan line straightening
increases the misregistration in the data. Thus it is expected that the
classification accuracy from processing scan line straightened data would decrease
in view of the results of this section. Future scanners and data preparation
algorithms and procedures must be designed to take every precaution to minimize
channel-to-channel spatial misregistration in order to optimize the conditions
under which scene classification and recognition processing are performed.

5

SIGNATURE EXTENSION

5.1 INTRODUCTION 

Signature extension is a process by which training statistics from one
scene may be modified and then used to classify features in a second scene
which differs from the first in geographic location or in the measurement
conditions under which the data were collected. This process may also
incorporate preprocessing of the data from either or both scenes. The goal of
signature extension is to minimize or to eliminate altogether the requirements
for collecting ground truth and extracting training statistics for the second
scene, thus reducing the costs and time delays associated with those procedures.
Signature extension would then help to provide timely and cost-effective
classification over extensive land areas, including remote areas for which ground
truth information may not be readily available. Testing, evaluation, and
further development of signature extension techniques are required to fully
realize this goal.

Several signature extension algorithms* were tested on SKYLAB S-192 data
collected over Southeastern Michigan. These algorithms and the testing
procedure followed are discussed below.

5.2 TRAINING AREA 

A portion of SKYLAB Pass 14 (5 August 1973), representing data from an 
area surrounding East Lansing, Michigan, was chosen for computing training 
statistics. The atmosphere over the area appeared to be fairly clear, although 
a bank of clouds was present only five miles northwest of this site. A 
clustering algorithm [1] was used to compile the training statistics, producing 
twenty-four signatures, ten of which could be associated with major features 
within the scene. These associations were determined with the aid of aerial 
photography and SKYLAB S-190A photography using both color and false color 
film, since no actual ground observations were performed in the East Lansing
area.


*These algorithms were developed by ERIM for NASA/JSC under contracts
NAS9-14123 and NAS9-9784.

The training statistics were extracted from S-192 data which was in conic
format. Although it made the correlation between the cluster classification
map and the photographic images more difficult, this data format provided better
spatial registration between the spectral bands of the S-192 scanner than would
have been obtained with scan-line-straightened data. The seven spectral bands
used in the signatures were those chosen as optimum for processing the
Michigan agricultural test site data, and are listed in Table 5.1.

The ten clusters identified from the training statistics for the East
Lansing area appeared to be associated with features in the scene as follows:
old residential - long established residential areas made up of closely
spaced houses and many mature trees; green sparse vegetation - low density
vegetated areas and also forests; green dense vegetation - high reflective
vegetated areas such as agricultural fields and lawns (parks); concrete -
high reflective areas mostly made up of segments of expressways and parking
lots, or a mixture of concrete areas with other bright materials such as
rooftops or high reflective soils; wet soil - wet unvegetated agricultural
land, also recognized major portions of a residential district with widely
spaced houses among mature trees; water - deep water which filled the
instantaneous field of view of the scanner; urban - impervious materials
such as parking lots and rooftops of large buildings (e.g., stores, warehouses,
and factories); high reflective urban - also impervious materials, higher
signal levels than urban which may be associated with real scene features or
localized differences in the haze layers; dry soil - freshly graded high
reflective soil such as gravel or sand; shallow water - a mixture of water
and shoreline signatures. These ten signatures were those employed in the
test of selected signature extension algorithms, as described in Section 5.3.
The other fourteen training cluster signatures classified only a few
intermittent pixels within the training area and hence were not used.

TABLE 5.1

SKYLAB S-192 CHANNELS CHOSEN FOR DATA PROCESSING
IN THE TRAINING AREA AND IN THE SIGNATURE EXTENSION AREA

S-192     WAVELENGTH        TRAINING AREA     EXTENSION AREA
BAND      (µm)              SDO #             SDO #

  3        .50  -  .55            2                 1
  6        .654 -  .734           8                 7
  7        .770 -  .890          10                 9
  8        .930 - 1.050          19                19
  9       1.030 - 1.190          20                20
 10       1.150 - 1.280          17                17
 11       1.550 - 1.730          12                11

5.3 SIGNATURE EXTENSION AREA

A second portion of SKYLAB Pass 14 (5 August 1973), representing data 
from a swath running from Ypsilanti, Michigan to the Detroit Metropolitan 
Airport, west of Detroit, was chosen for testing the signature extension 
algorithms. This area was located less than sixty miles downtrack from the 
training area. However, the atmosphere over this scene was noticeably hazy, 
with some occasional, but small, clouds being present as well. This scene 
appeared to contain nearly the same proportions of the ten selected training
classes as did the training scene.

Haze would be expected to affect the scanner data in the following
manner. First there would be an increased additive component of the sensed
radiation due to increased path radiance. The effects of increased attenuation
by the hazy atmosphere would also affect the radiation, but on balance it is
expected that the resultant data values for a class viewed through a hazy
atmosphere will be greater than the values for that class when viewed through
a clearer atmosphere. The net effect is to reduce the signal contrast in all
bands.
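
The two haze effects described above can be sketched with a simple linear
model in which the sensed signal is the scene signal scaled by an atmospheric
transmittance plus an added path radiance term. This first-order form, the
function name, and all numerical values below are assumptions for
illustration only; they are not measurements from the S-192 data.

```python
def through_haze(signal, transmittance, path_radiance):
    """Sensed signal for a scene signal viewed through a hazy atmosphere.

    Assumed linear model: attenuation acts multiplicatively (transmittance
    below 1) and path radiance adds a constant offset.
    """
    return transmittance * signal + path_radiance


# Hypothetical clear-atmosphere signal levels (counts) in one band.
dark, bright = 20.0, 80.0

# Hypothetical haze parameters for that band.
T, LP = 0.85, 15.0

hazy_dark = through_haze(dark, T, LP)
hazy_bright = through_haze(bright, T, LP)

print("dark:   ", dark, "->", hazy_dark)
print("bright: ", bright, "->", hazy_bright)
print("contrast:", bright - dark, "->", hazy_bright - hazy_dark)
```

With these assumed values both classes produce higher signals through the
haze, because the added path radiance outweighs the attenuation loss, while
the difference between them shrinks by the transmittance factor.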

The data available for the signature extension area was in scan-line-
straightened format, which caused a degradation in the inter-channel spatial
registration within this scene relative to the training scene, which was in
conic scan format. Although the same spectral bands were used to process this
scene, different SDOs (Scientific Data Outputs) were chosen, when available,
to maximize the registration between channels (see Table 5.1).

The various processing schemes applied to the signature extension scene
are described in the subsections below.

5.3.1 LOCAL CLUSTERING RESULTS 

As a prelude to testing the selected signature extension algorithms, the
clustering program was run on a subset of data (around Ypsilanti proper)
comprising approximately twelve percent of the signature extension area.
Although more than twenty clusters were obtained (as in the signature training
area), only eight major clusters emerged where each represented more than one
percent of the clustered area, and these were not in an exact one-to-one
correspondence with those identified in the training area.

The eight clusters selected from the signature extension area statistics
appeared to be associated with features in the overall scene as follows:
green sparse vegetation - low reflective vegetated areas including forests
and some agricultural fields, a slightly more sparse vegetation signature than
that obtained from the training area; green dense vegetation - high reflective
vegetated areas such as agricultural fields, similar to the corresponding training
signature, but encompassing a greater variety of features within the signature
extension scene due to the differences in the local sparse vegetation cluster;
old residential / urban - included parking lots and sparsely vegetated portions
of old residential areas, surrendering the remainder of the old residential
areas to either the local urban cluster or the local sparse vegetation cluster;
water / residential - a mixture of a water signature with a residential
signature: developed areas along lake or river shorelines; water - deep water
which filled the instantaneous field of view of the scanner; water / vegetation -
a mixture of a water signature with a vegetation signature: vegetated areas
along lake or river shorelines; soil - agricultural fields with little or no
vegetation and vegetated areas mostly obscured by haze adjacent to the small
clouds which were present in the scene, also some concrete; urban / residential -
partly vegetated urban and residential areas, mixtures of bright objects
(rooftops, concrete) with vegetation. Table 5.3 lists the percentage of the
signature extension scene recognized by each local cluster class when these
cluster signatures were applied to the total scene.

5.3.2 RESULTS WITH UNALTERED TRAINING SIGNATURES 

Since the atmosphere over the signature extension area was much hazier
than that over the training area, higher signal levels would be expected and
one would expect the classification of the scene using the unaltered training
signatures to be biased in favor of the higher reflectance classes. In fact
the testing of this arrangement confirmed that expectation, with vegetated areas
being classified in favor of the dense vegetation, with water classification
biased in favor of shallow water recognition and with urban signatures
dominating over residential signatures. In addition, especially hazy areas,
some bright urban or residential areas, and some areas of concrete were
recognized by the dry soil signature. The percentage of the signature
extension scene recognized as each training class, using unaltered signatures,
is listed in Table 5.4 together with the corresponding percentages recognized
after applying each of the signature extension techniques discussed below.

TABLE 5.2

APPROXIMATE PERCENTAGE OF THE TRAINING SCENE
COVERED BY EACH TRAINING CLASS

TRAINING                                    TRAINING AREA PERCENTAGE
CLUSTER     CLUSTER IDENTIFICATION          (50250 PIXELS)

   2        old residential                        10.7
   3        green sparse vegetation                37.2
   4        green dense vegetation                 14.8
   6        concrete                                6.2
   8        wet soil                                9.0
  13        water                                   0.7
  14        urban                                  10.8
  17        high reflective urban                   8.2
  18        dry soil                                1.2
  20        shallow water                           1.2
            unclassified                            0.1

TABLE 5.3

APPROXIMATE PERCENTAGE OF THE SIGNATURE EXTENSION SCENE
COVERED BY EACH LOCAL CLUSTER CLASS

EXTENSION                                   EXTENSION AREA PERCENTAGE
CLUSTER #   CLUSTER IDENTIFICATION          (85250 PIXELS)

   1        green sparse vegetation                43.4
   4        green dense vegetation                 19.1
   6        old residential / urban                 1.9
   9        water / residential                     0.6
  11        water                                   0.9
  14        water / vegetation                      0.9
  17        soil                                   12.7
  23        urban / residential                    19.5
            unclassified                            1.0

5.3.3 RESULTS WITH DARK OBJECT ADDITIVE SIGNATURE CORRECTION

The dark object signature correction [11] assumes, channel-by-channel,
that the signal levels generated by dark objects (objects of low reflectance
and/or low irradiance) represent path radiance and therefore provide a means
to estimate an additive correction to the mean levels of each training
signature in each channel. In an attempt to avoid using correlations between
spurious or anomalous low signal levels, the low end of the histogram continuum
is judged to be the most appropriate reference point for the algorithm. Since
spurious or anomalous gaps in the histogram continuum are also possible
artifacts of any scene, this algorithm is not by any means foolproof. The
algorithm also provides only an additive signature correction, whereas it is known
from study of mathematical models for signature variations that a multiplicative
signature correction would be desirable as well.
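
The idea of the dark object correction can be sketched as below. This is a
minimal Python illustration and not the implementation of [11]: the rule used
to locate the low end of the histogram continuum (the lowest count that starts
a run of `min_run` consecutive occupied histogram bins), the function names,
and the sample data are all assumptions made for the example.

```python
from collections import Counter


def low_end_of_continuum(values, min_run=3):
    """Lowest count that starts a run of min_run consecutive occupied bins.

    Isolated low counts (possible spurious samples) are skipped. The run
    length rule is an illustrative assumption.
    """
    hist = Counter(values)
    for level in sorted(hist):
        if all((level + k) in hist for k in range(min_run)):
            return level
    return min(hist)  # no continuum found: fall back to the minimum


def dark_object_offsets(train_scene, ext_scene, min_run=3):
    """Additive correction per channel (extension dark level minus training).

    Each scene is a list of channels; each channel is a list of integer counts.
    """
    return [low_end_of_continuum(e, min_run) - low_end_of_continuum(t, min_run)
            for t, e in zip(train_scene, ext_scene)]


def correct_means(signature_means, offsets):
    """Add the per-channel offsets to one training signature's means."""
    return [m + o for m, o in zip(signature_means, offsets)]


# Hypothetical two-channel scenes (counts). The isolated values 2 and 5 in
# the first channel play the role of spurious low samples.
train_scene = [[2, 12, 13, 13, 14, 15, 16, 20], [5, 6, 7, 8]]
ext_scene = [[5, 21, 21, 22, 23, 24, 30], [6, 7, 8, 9]]

offsets = dark_object_offsets(train_scene, ext_scene)
print(offsets)
print(correct_means([55.0, 14.0], offsets))
```

The sketch also shows the weakness noted above: a gap inside the true
continuum would move the detected low end and change the correction.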

Table 5.5 lists the additive changes to the training signature means which
were determined by the dark object signature extension algorithm. Also listed
are the corresponding training signature changes resulting from the other
algorithms discussed below. Note that the dark object algorithm generated
larger corrections (in counts) for the shorter than for the longer wavelength
bands, as might be expected from the physical cause of the differences between
the training and signature extension scenes (i.e., haze).

The percentage of the signature extension scene recognized as each training
class, after application of the dark object algorithm, is listed in Table 5.4.
These results may be compared to the local cluster classification results
listed in Table 5.3, bearing in mind that some of the local cluster categories
do not correspond exactly to the training cluster categories. Classification
maps also were generated and were compared with aerial photography. Generally
the dark object classification of the signature extension scene was judged to be
a dramatic improvement over classification with unaltered training signatures,
although there was evidence that the algorithm over-corrected for the differences
between the training and signature extension scenes. In particular the recognition
where the haze was densest was unexpectedly accurate, while in areas where the
haze density was closer to the average for the scene there was a tendency to
classify some urban areas as old residential areas and to classify marginal
concrete areas as urban. Water recognition, however, was accurate. This
tendency to misclassify bright features as darker features while correctly
classifying the darkest features correlates with the effect of excluding a
multiplicative signature correction for the effect of the haze.

TABLE 5.4

PERCENTAGE OF THE SIGNATURE EXTENSION SCENE
CLASSIFIED AS EACH TRAINING CLASS

CLUSTER IDENTIFICATION        UNALTERED      DARK OBJECT     MEAN LEVEL
                              SIGNATURES     CORRECTION      CORRECTION

old residential
green sparse vegetation
green dense vegetation
concrete
wet soil
water
urban
high reflective urban
dry soil
shallow water
unclassified

TABLE 5.5

SIGNATURE CORRECTIONS DETERMINED BY EACH SIGNATURE EXTENSION ALGORITHM (IN COUNTS)

S-192     WAVELENGTH        DARK OBJECT    MEAN LEVEL     MASC          MASC
BAND #    (µm)              CORRECTION     CORRECTION     (ADDITIVE)    (MULT.)
                            (ADDITIVE)     (ADDITIVE)

  3        .50  -  .55           9            9.07          40.87        .529
  6        .654 -  .734          9            8.94          19.63        .741
  7        .770 -  .890         10           11.32         -10.16       1.195
  8        .930 - 1.050          3            9.75         -11.70       1.146
  9       1.030 - 1.190          5            8.12         -17.15       1.217
 10       1.150 - 1.280         -1            7.48          -8.20       1.081
 11       1.550 - 1.730          0            -.15          -1.16        .862

5.3.4 RESULTS WITH MEAN LEVEL ADJUSTMENT SIGNATURE CORRECTION 

The mean level adjustment algorithm [12] utilizes the correlation
between averages over portions of the training scene and the signature
extension scene to estimate a correction to the mean levels of each training
signature in each channel, an additive correction in this case. Alternatively,
a purely multiplicative correction could be estimated; however, in this experiment
the difference between the training and signature extension scenes (haze density)
would be expected to produce a mostly additive effect. The algorithm requires that
the portions of the two scenes whose averages are to be compared be of similar
composition (i.e., contain similar percentages of each ground cover). Table 5.2 lists
the approximate percentage present of each training class in the portion of the
training scene which was averaged for this algorithm, while Table 5.3 lists
the percentage for each local cluster class in the portion averaged from the
signature extension scene. Although differences between local cluster categories
and training cluster categories prevent a complete comparison between the data
in these tables, a general similarity between the two scenes is evident.

Close inspection of the false color IR photography for these two areas
revealed that some of the dissimilarities tended to balance each other, but
that the signature extension scene appeared to have a slightly greater
percentage, overall, of brighter features.

Since implementation of the mean level adjustment algorithm, like 
the dark object algorithm, provided only an additive signature correction, it 
might be expected to be only partially effective in general. In this particular 
application it was judged to be only slightly less effective than the dark 
object algorithm, with its results a bit more biased toward over-correction of 
the difference between the training and signature extension scenes. The bias 
toward bright features in the average over the signature extension scene apparently 
led to a mean level signature correction which biased the modified training 
signatures in favor of less bright materials. 
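
The sensitivity of the mean level adjustment to scene composition can be
illustrated with a small sketch. It is written in Python under the assumption
that the additive correction in each channel is simply the difference of the
scene averages; the function name and the pixel values are hypothetical and
are not taken from the S-192 data.

```python
def mean_level_offsets(train_scene, ext_scene):
    """Additive correction per channel: extension average minus training average.

    Each scene is a list of channels; each channel is a list of counts.
    """
    return [sum(e) / len(e) - sum(t) / len(t)
            for t, e in zip(train_scene, ext_scene)]


HAZE = 8.0  # hypothetical true additive haze effect, in counts

# One channel, two materials: dark (30 counts) and bright (70 counts).
train = [[30.0] * 5 + [70.0] * 5]                      # 50% dark, 50% bright

# Extension scene with the same composition as the training scene.
ext_same = [[30.0 + HAZE] * 5 + [70.0 + HAZE] * 5]

# Extension scene with a greater share of bright features (60%).
ext_brighter = [[30.0 + HAZE] * 4 + [70.0 + HAZE] * 6]

print(mean_level_offsets(train, ext_same))
print(mean_level_offsets(train, ext_brighter))
```

When the compositions match, the correction recovers the assumed haze offset.
When the extension scene holds more bright features, the correction exceeds
it, which raises the modified signature means and so favors the assignment of
pixels to less bright classes, in line with the over-correction described
above.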

The additive signature corrections generated by the mean level adjustment
algorithm are listed in Table 5.5. Note that the corrections for the shorter
wavelength bands are nearly the same, overall, as those for the longer wavelength
bands. Of course the relationship between counts and radiance is not being
considered here, as perhaps it should be; however, the difference between the
mean level adjustment classification results and the dark object results lies
mostly in the treatment of the longer wavelength bands. It appears that this
difference reflects the fact that the mean level adjustment results are slightly
more biased in favor of darker materials and that the longer wavelength bands
show more contrast between the features of the scene than do the shorter
wavelength bands.

5.3.5 RESULTS WITH MASC

The Multiplicative and Additive Signature Correction (MASC) [11] employs
a least squares regression to match training cluster mean signal levels with
local cluster mean levels, based on the ordering and spacing of those signature
means within a chosen data channel. The data channel selected for comparing the
ordering and spacing of the clusters within the two signature data sets is
used to define exclusive paired matches between training clusters and local
clusters. Extra clusters are discarded from the larger cluster set so that
the obtainable matching between the remaining clusters is maximized. This
matching is achieved by a least squares determination of appropriate
multiplicative and additive coefficients in each data channel. Mathematical models
of expected signature variations (changes in the atmosphere, in the illumination
of the scene, and in the scanner responsivity) predict that these variations
should be both multiplicative and additive, hence a proper association between
clusters of the training data set and clusters of the local data set should
produce a realistic signature correction from the MASC algorithm.
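
The regression step of such a correction can be sketched for a single data
channel as follows. This Python example assumes the cluster pairing has
already been made and covers only the least squares fit and its application to
a signature; it is not the MASC implementation of [11], and the function names
and cluster means are hypothetical.

```python
def fit_channel(train_means, local_means):
    """Least squares fit of local_mean = a * train_mean + b for one channel.

    train_means[k] and local_means[k] are the means of the k-th paired
    training and local clusters in that channel.
    """
    n = len(train_means)
    mx = sum(train_means) / n
    my = sum(local_means) / n
    sxx = sum((x - mx) ** 2 for x in train_means)
    sxy = sum((x - mx) * (y - my) for x, y in zip(train_means, local_means))
    a = sxy / sxx          # multiplicative coefficient
    b = my - a * mx        # additive coefficient
    return a, b


def apply_channel(mean, variance, a, b):
    """Transform one signature's mean and variance in the fitted channel.

    Under the linear change x -> a * x + b the mean becomes a * mean + b and
    the variance is scaled by a squared.
    """
    return a * mean + b, a * a * variance


# Hypothetical paired cluster means (counts) in one channel.
train_means = [10.0, 30.0, 50.0, 70.0]
local_means = [20.0, 36.0, 52.0, 68.0]

a, b = fit_channel(train_means, local_means)
print("multiplicative:", round(a, 3), "additive:", round(b, 3))
print(apply_channel(40.0, 25.0, a, b))
```

Because the fit depends entirely on which clusters are paired, a mismatched
pair pulls both coefficients away from the values a correct association would
give.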

The MASC algorithm was implemented using the 10 Lansing area clusters
and the eight test area clusters previously mentioned. Table 5.6 lists the
cluster associations determined by the MASC algorithm, based on using S-192
Band #11 (1.550-1.730 µm) to order the clusters. This band was chosen for
the cluster ordering because it had been determined to be the single most
useful band for classifying the Michigan agricultural test site data. Note
that the cluster pairings obtained are not optimum. This appears to have
occurred because one band does not adequately separate all classes; a minimum
of two channels would have been needed in this case to achieve an unambiguous
separation of the cluster classes. Another aspect of this data set was that
there was not a good one-to-one correspondence between the clusters in the
training and signature extension data sets. The multiplicative and additive
coefficients determined for this cluster pairing arrangement are listed in
Table 5.5.

In order to facilitate a comparison between the MASC coefficients and the
purely additive coefficients of the dark object and mean level adjustment
algorithms, Table 5.7 has been generated. The additive coefficients listed in
Table 5.7 represent the change in the signatures for the darkest material (water)
and for the brightest material (dry soil) that result from applying the
multiplicative and additive coefficients of MASC. These may be compared with
the coefficients for the dark object and mean level adjustment algorithms which
are listed in Table 5.5. It should be noted that the variance and covariance
values of the signatures were also affected by the multiplicative coefficient
in this application of MASC.

TABLE 5.6

TRAINING AREA AND SIGNATURE EXTENSION AREA CLUSTER ASSOCIATIONS
SELECTED AND OPTIMIZED BY THE MASC ALGORITHM

Training     Training Cluster              Local        MASC
Cluster #    Identification                Cluster #    Associated Cluster

    2        old residential                   6        old residential / urban
    3        green sparse vegetation          --        --
    4        green dense vegetation            4        green dense vegetation
    6        concrete                         23        urban / residential
    8        wet soil                          9        water / residential
   13        water                            11        water
   14        urban                            --        --
   17        high reflective urban             1        green sparse vegetation
   18        dry soil                         17        soil
   20        shallow water                    14        water / vegetation

TABLE 5.7

EQUIVALENT PURELY ADDITIVE CHANGES TO SIGNATURE MEANS OF WATER AND DRY SOIL
TRAINING CLASSES (IN COUNTS)

                            Water Training Signature     Dry Soil Training Signature

S-192     Wavelength        Unaltered    MASC Equivalent  Unaltered    MASC Equivalent
Band #    (µm)              Mean Value   Change           Mean Value   Change

  3        .50  -  .55        55.99        14.52            79.16         3.62
  6        .654 -  .734       41.42         8.88           102.39        -6.94
  7        .770 -  .890       24.64        -5.36            79.68         5.37
  8        .930 - 1.050       26.43        -7.85            93.42         1.92
  9       1.030 - 1.190       25.08       -11.70            90.55         2.54
 10       1.150 - 1.280       21.01        -6.49            92.24         -.70
 11       1.550 - 1.730       13.66        -3.04            76.97       -11.76
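
The relation between the MASC coefficients of Table 5.5 and the equivalent
additive changes of Table 5.7 can be checked with a short calculation. The
sketch below, in Python, assumes that the equivalent change is the corrected
mean minus the unaltered mean, that is (mult - 1) x mean + additive; the
function and variable names are illustrative. The coefficients and mean values
are those listed in the two tables.

```python
# (multiplicative, additive) MASC coefficients per S-192 band, from Table 5.5.
MASC = {3: (0.529, 40.87), 6: (0.741, 19.63), 7: (1.195, -10.16),
        8: (1.146, -11.70), 9: (1.217, -17.15), 10: (1.081, -8.20),
        11: (0.862, -1.16)}

# Unaltered signature means per band, from Table 5.7.
WATER = {3: 55.99, 6: 41.42, 7: 24.64, 8: 26.43, 9: 25.08,
         10: 21.01, 11: 13.66}
DRY_SOIL = {3: 79.16, 6: 102.39, 7: 79.68, 8: 93.42, 9: 90.55,
            10: 92.24, 11: 76.97}


def equivalent_additive_change(mean, mult, add):
    """Corrected mean minus unaltered mean for one band (assumed definition)."""
    return (mult - 1.0) * mean + add


water_change = {band: equivalent_additive_change(WATER[band], *MASC[band])
                for band in MASC}
dry_soil_change = {band: equivalent_additive_change(DRY_SOIL[band], *MASC[band])
                   for band in MASC}

for band in sorted(MASC):
    print(band, round(water_change[band], 2), round(dry_soil_change[band], 2))
```

Under that assumption the computed changes agree with the Table 5.7 entries to
within about 0.1 count in every band, a residual consistent with rounding of
the tabulated coefficients.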

Note in Table 5.6 that the association of the concrete and high reflective
urban training clusters with lower reflectance local clusters (urban / residential
and green sparse vegetation, respectively) would tend to bias this MASC
classification of the signature extension scene toward brighter materials.
In fact such a bias was observed, with deep water areas mostly classified as
shallow water, with residential areas classified as urban, and with urban areas
recognized by the concrete signature. This bias in the recognition is
indicated by the small positive or sometimes negative equivalent additive changes
to the signature means for the longer wavelength bands, listed in Table 5.7.
This result actually represented a small step backward from using the training
signatures without alterations.

It appears that further algorithm development, addition of some safeguards
against misassociation of clusters, and/or some intervention by the analyst
are required for the MASC algorithm to realize its full potential. Some
specific recommendations for improving the MASC algorithm, based on its
observed performance with this data set, are discussed in Section 5.3.7.

5.3.6 RESULTS WITH ADAPTIVE PROCESSING 

Adaptive processing [13], using a decision-directed Kalman filter, was
also tested on the S-192 data set by generating recognition maps from local
cluster signatures and from MASC signatures, but no noticeable improvement in
the classification of the scene was observed. It appears that the variations
in the density of the haze over the signature extension scene were sufficiently
localized so that a rate of signature adaptation which would be able to correct
for the haze adjacent to a small cloud would also react to local changes in the
material composition of the scene, leading either to signature capture or to
localized biases in the classification. It seems that in order for adaptive
processing to improve upon results obtained from conventional techniques,
the signature variations should occur on a scale in time or space which is
noticeably greater than the scale of localized changes in scene composition.
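
The notion of a rate of signature adaptation can be illustrated with a much
simplified stand-in for the method of [13]. The Python sketch below is not a
Kalman filter: it assumes a single channel, nearest-mean classification, and a
fixed adaptation gain, and the function name and data are hypothetical. It
shows only the decision-directed principle, in which each pixel is first
classified and the mean of the chosen class is then moved toward that pixel.

```python
def adapt_and_classify(pixels, means, gain):
    """Decision-directed adaptation with a fixed gain (illustrative only).

    Each pixel is labelled with the nearest class mean, and that mean is then
    moved toward the pixel by the fraction `gain` of the difference.
    """
    means = list(means)
    labels = []
    for x in pixels:
        k = min(range(len(means)), key=lambda i: abs(x - means[i]))
        labels.append(k)
        means[k] += gain * (x - means[k])
    return labels, means


# Hypothetical scan: dark (20) and bright (60) materials alternate while an
# additive haze term grows slowly from 0 to 10 counts.
pixels = []
for haze in range(11):
    pixels += [20.0 + haze, 60.0 + haze]

labels, final_means = adapt_and_classify(pixels, [20.0, 60.0], gain=0.5)
print(labels)
print([round(m, 2) for m in final_means])
```

In this idealized case the haze changes slowly compared with the alternation
of materials, so the adapted means follow it with a small lag and every pixel
keeps its correct label. The same gain applied where haze changes as abruptly
as the scene composition would let a mean drift toward a neighboring material,
which is the signature capture described above.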

5.3.7 COMPARISON OF RESULTS 

Of the signature extension techniques tested on this S-192 data set,
the dark object correction appeared to do surprisingly well, with the mean
level adjustment additive correction being a not-too-distant second best.
MASC, on the other hand, did not do as well as expected, even less well than
using training signatures without any alterations. Although these results
run somewhat contrary to recent experiences [11] with some LANDSAT data sets,
this surprise serves to bring out more clearly perhaps some of the advantages,
disadvantages, and needs for improvement in these algorithms. Some specific
observations in this regard are discussed below.

The surprisingly good performance of the dark object algorithm with the 
chosen S-192 data set may have been aided by the nature of the difference 
between the training scene and the signature extension scene (i.e., atmospheric 
haze) which might have caused a change in the signal levels which was mostly 
additive. This suggests that the cause of the signal change from one scene 
to another is a consideration in selecting an optimum signature extension 
algorithm for a particular application. 
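
The additive character of the two simpler corrections can be sketched as follows. This is an illustration under stated assumptions, not the ERIM implementation: the dark object shift is taken here as the per-channel difference between the darkest signals of the two scenes, the mean level shift as the difference between the scene means, and all function names and data values are hypothetical.

```python
import numpy as np


def dark_object_shift(train_scene, ext_scene):
    """Additive shift per channel, taken from the darkest signal in each scene."""
    return ext_scene.min(axis=0) - train_scene.min(axis=0)


def mean_level_shift(train_scene, ext_scene):
    """Additive shift per channel, taken from the overall scene means."""
    return ext_scene.mean(axis=0) - train_scene.mean(axis=0)


# Hypothetical two-channel scenes (rows are pixels); haze adds a constant.
train = np.array([[5.0, 8.0], [20.0, 30.0], [40.0, 50.0]])
haze = np.array([6.0, 3.0])
same_mix = train + haze                       # same class composition
new_mix = np.vstack([same_mix, same_mix[2]])  # one extra bright pixel

print(dark_object_shift(train, same_mix), mean_level_shift(train, same_mix))
print(dark_object_shift(train, new_mix), mean_level_shift(train, new_mix))
```

When the change is purely additive and the class composition is unchanged, both estimates recover the haze term. When the composition differs, only the dark object estimate does in this sketch, provided the dark object is present in both scenes.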

The mean level adjustment signature correction algorithm requires that 
the training scene and the signature extension scene be similar in composition 
of classes. Apparently, in this S-192 data set the training and signature 
extension scenes were sufficiently similar so that, with the differences between 
the two scenes being mostly additive, relatively good classification of the 
signature extension area was obtained. The requirement for statistical simi- 
larity between scenes, however, may be too restrictive for similar good results 
to be expected in other applications of the algorithm.

MASC, although performing poorly in this instance, is potentially the
most powerful of those techniques tested with this S-192 data set. It pro- 
vides for both an additive and a multiplicative correction in each channel 
of each signature and does not require the degree of statistical similarity 
between scenes that is needed for the mean level adjustment algorithm. How- 
ever, it does require that the clusters obtained from each of the scenes 
represent similar classes. The disappointing performance of the MASC algorithm 
used with this S-192 data set appears to have been caused by its only partial 
capability to identify and avoid the prejudicial effects of anomalous clusters 
(those without counterparts in the other cluster set). Since clustering
algorithms probably cannot be expected to produce sets of signatures from two 
different scenes which are in a close one-to-one correspondence, some method 
is needed to identify non-correlating clusters and to edit them out of the 
cluster matching procedure which is the groundwork for calculating the signa- 
ture corrections. This editing process could be aided by including more than 
one data channel in the cluster matching algorithm. Using more than one data 
channel would also help to increase accuracy in identifying the proper pairing
between the clusters that remained.
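
The effect of an anomalous cluster on a combined multiplicative and additive correction can be sketched as follows. The sketch assumes that the cluster pairing is already given and fits the correction by ordinary least squares in each channel; the actual MASC matching procedure is not reproduced, and the function name and cluster values are hypothetical.

```python
import numpy as np


def masc_coefficients(train_clusters, ext_clusters):
    """Fit ext = a * train + b independently in each channel.

    Rows are cluster means, assumed to be paired one-to-one already.
    """
    n_channels = train_clusters.shape[1]
    a = np.empty(n_channels)
    b = np.empty(n_channels)
    for c in range(n_channels):
        # polyfit with degree 1 returns (slope, intercept)
        a[c], b[c] = np.polyfit(train_clusters[:, c], ext_clusters[:, c], 1)
    return a, b


# Hypothetical cluster means in two channels.
train = np.array([[10.0, 20.0], [30.0, 40.0], [50.0, 60.0], [70.0, 80.0]])
ext = 0.8 * train + np.array([5.0, 2.0])  # true gain 0.8, offsets 5 and 2

a_good, b_good = masc_coefficients(train, ext)

# Replace the last extension cluster with one that has no counterpart.
ext_bad = ext.copy()
ext_bad[3] = [20.0, 25.0]
a_bad, b_bad = masc_coefficients(train, ext_bad)

print(a_good, b_good)
print(a_bad, b_bad)
```

A single mismatched pair is enough to pull the fitted gain far from its true value, which is why editing non-correlating clusters out of the matching step matters.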

Adaptive processing improves performance only when gradual changes
of the measurement conditions occur over a scene. Also, there probably is
a tendency, when choosing test cases for signature extension, to select
training and extension data sets over which the measurement conditions are
fairly uniform, in order to better assess the performance of the non-adaptive
signature extension algorithms. Such test cases might use adaptive processing
merely as a way to perform a fine-tuning adjustment on the extended signatures.
In the present instance, localized variations in the haze density over the
signature extension scene caused too much variability for adaptation to
properly establish such a fine-tuning adjustment in time to affect the classi-
fication over the most important part of the signature extension scene (i.e.,
the beginning), while a more rapid rate of adaptation led to signature capture.

In summary, although the dark object signature correction appeared to
do the best with this S-192 data set, it is believed that, among the signature
extension algorithms tested, MASC has the most ability to improve and to grow
to produce the best performance in the long run. Following original develop-
ment of MASC, improvements and modifications to the basic approach are being
pursued at ERIM as well as at other institutions.

6

MIXTURES AND SUBRESOLUTION ELEMENT PROCESSING 

When a spatial resolution element overlaps the boundary between two or
more ground classes, the radiation detected will be a mixture from the classes
involved. The spatial resolution of the SKYLAB S-192 scanner is such that,
compared to the size of the fields or areas of the ground cover classes, the
frequency of mixture pixels is fairly large. An analysis of this effect for
an agricultural site is presented in Section 6.1 below. Further, situations arise
where the classes of interest are smaller than the system's resolution. The
use of conventional multispectral processing techniques on mixture pixels
will likely result in the improper classification of these pixels. If the
number of mixture pixels in the data is large, processing errors can be
expected to be numerous as well. In cases where the objects of interest are
too small to be resolved, standard processing would be incapable of proper
classification for that class.

Processing a sizable number of mixture pixels using conventional
processing techniques has a major impact on the accurate estimation of
proportions or acreages of classes in the scene. Such processing techniques
rely on compensating errors to cancel the measurable effects of
misclassifications or on some fixed bias in the estimate to produce accurate
proportion estimates. That misclassification errors do not compensate is shown, in
Section 6.1 below, by means of the same simulation techniques previously
described in Section 4.

For the past several years ERIM has been developing special processing
techniques* [14, 15, 10] to handle such situations. In this section we present
the results of two studies where mixtures processing was applied to S-192
data. As previously mentioned, two processing studies had been carried out
on the S-192 data: the first for the agricultural test site, and the
second for the urban and suburban areas around Lansing, Michigan. For
the agricultural area, mixture processing techniques were used to
try to better estimate the proportion of the classes of interest in the
scene. The second study utilized mixtures processing techniques to estimate
the proportion of vegetative matter in an urban scene. Discussions of these
studies are presented in Sections 6.3 and 6.4, respectively.

*The development of these techniques has been supported by NASA/JSC
under contracts NAS9-9784 and NAS9-14123.

6.1 IMPACT OF MIXTURE PIXELS ON PROPORTION ESTIMATION 

In assessing the impact of the standard processing of mixture pixels 
on proportion estimation, a first consideration is the proportion of the 
pixels in the scene which are mixture pixels. This information would give 
an indication as to the severity of the problem: the larger the proportion
of mixture pixels, the greater the likely impact on proportion estimation. 
Earlier in this report we addressed the problem of locating pure field center 
pixels and noted the substantial number of border pixels for just the larger 
fields in the agricultural test site. 

To more directly assess the number of mixture pixels in the scene, one 
section (1 mile square) of the agricultural test site was selected; field 
boundaries for all fields were drawn on a map and the number of pure 
pixels and mixture pixels were counted. In the counting procedure, pixels were 
deemed to be field center if their edges were more than 0.3 pixel from a
boundary in the scan (points) direction and more than 0.1 pixel in the along
track (lines) direction, thus accounting for the effects of resolution 
element size and the misregistration of the bands. 

Section 109, the section selected, was chosen because it had the same 
number of fields and about the same number of acres as the average over 
sections: 31 fields, 616 acres. The map displaying the fields and pixels
is shown in Figure 6.1. As noted, out of a total of 514 pixels, only 152 or
30% of them were pure field center pixels and the other 362 or 70% were
mixture pixels. Furthermore, it seemed from the analysis that if the data 
had been perfectly registered, the number of mixture pixels would not have 
been significantly reduced. Thus it can be safely concluded that the 
majority of pixels being considered in this agricultural scene are mixture 
and not field center pixels.

Total Fields        = 31
Total Acreage       = 616
Total Pixels        = 514   100%
Field Center Pixels = 152    30%
Mixture Pixels      = 362    70%

FIGURE 6.1. DISPLAY OF MIXTURE PIXELS IN LOCKE TOWNSHIP SECTION 109

This situation has a major impact on the accurate estimation of proportions.
The more mixture pixels in a scene that can be spuriously classified, the more 
the accurate estimation of proportions is dependent upon compensation of the 
errors. Consider for the moment a mixture pixel of two classes, say trees 
and grass. Using standard proportion estimation procedures, it would be hoped 
that such a mixture pixel would be classified as either trees or grass and 
that the number of times such pixels fall in either class is equal to the 
overall proportion of grass and trees found in all such mixtures. Should a 
disproportionate number of false alarms, that is detections of a third class, 
occur among this mixture of trees and grass, then the task of accurate proportion 
estimation becomes more difficult and an even greater reliance is placed on 
the compensation of errors.

The simulation technique used in the analysis described in Section 4 
was applied to measure how prevalent a problem the false alarm rate could be 
in the given S-192 data set. Recall that five signatures, for corn, grass, 
tree, bare soil and brush, were chosen and mixtures of all possible pairs of 
these crops were simulated in proportions (1/3, 2/3) and (2/3, 1/3). Table 
6.1 displays the expected performance for the recognition classes. Given a 
mixture of crops A and B, one would hope that the sum of the percentage of those 
mixture pixels classified as A and those classified as B would be close to 
100%. The difference would be the number of false alarms detected. In 
examining the last column of Table 6.1, one finds that the false alarm rate 
is by no means insignificant. The lowest false alarm rate detected is 10% 
while the highest rate is 76%. 
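
The false alarm figure is simply the remainder after the two constituent classes are accounted for. As a worked check, the short sketch below applies that arithmetic to the two extreme rows of Table 6.1, grass-soil (1/3, 2/3) and soil-brush (1/3, 2/3); the function and dictionary names are hypothetical.

```python
def false_alarm_rate(assigned, class_a, class_b):
    """Percent of simulated mixture pixels assigned to neither constituent."""
    correct = assigned[class_a] + assigned[class_b]
    return round(100.0 - correct, 1)


# Percent assigned to the two constituent classes, from Table 6.1.
grass_soil = {"grass": 27.0, "soil": 63.0}
soil_brush = {"soil": 1.0, "brush": 23.0}

print(false_alarm_rate(grass_soil, "grass", "soil"))   # 10.0
print(false_alarm_rate(soil_brush, "soil", "brush"))   # 76.0
```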

Thus it appears that the number of false alarms from mixture pixels is 
significant when the pixels are classified using conventional techniques. 

What, then, does this mean in terms of overall accurate proportion estimation? 

Going back to the tree-grass example cited above, consider that a high
rate of corn classifications occurs among pixels which are mixtures of trees
and grass. Such false alarms would need to be compensated for by a decline in
the classification rate of pure corn pixels and/or by offsetting false alarms
of other classes among corn and corn-mixture pixels. This then triggers a
chain-reaction of other compensations within other classes. The odds of this
all happening so that the errors do indeed compensate, would seem to be very
slight.

TABLE 6.1 EXPECTED PERFORMANCE FOR RECOGNITION OF SIMULATED
SKYLAB MIXTURE PIXELS BASED ON THE BEST LINEAR
DECISION BOUNDARIES BETWEEN FIVE SKYLAB FIELD
CENTER SIGNATURES

                                      % ASSIGNED TO CLASS                  % CORRECT*   % FALSE
MIXTURE      PROPORTION      TREE  GRASS   SOIL  BRUSH   CORN    UNC   CLASSIFICATION    ALARMS

TREE-GRASS   (1/3, 2/3)       [?]   44.0    0.8    [?]   23.4    5.2             48.2      51.8
             (2/3, 1/3)      31.6   10.2    [?]    [?]   21.2   12.0             41.8      58.2
TREE-SOIL    (1/3, 2/3)       4.6    8.8   56.6    [?]    [?]   17.8             61.2      38.8
             (2/3, 1/3)      47.2    6.6    8.2    0.6   13.0   24.4             55.4      44.6
TREE-BRUSH   (1/3, 2/3)       6.0    [?]    0.0   67.8   16.8    5.0             73.8      26.2
             (2/3, 1/3)      32.4    3.6    0.0   37.6   15.6   10.8             70.0      30.0
TREE-CORN    (1/3, 2/3)       9.6    5.0    0.0   27.6   52.8    5.0             62.4      37.6
             (2/3, 1/3)      39.0    2.2    0.0   26.2   25.2    [?]             74.2      25.8
GRASS-SOIL   (1/3, 2/3)       0.8   27.0   63.0    0.0    [?]    6.6             90.0      10.0
             (2/3, 1/3)       [?]   69.8   15.4    [?]    7.0    5.4             85.2      14.8
GRASS-BRUSH  (1/3, 2/3)       0.6   20.2    0.0   51.2   23.4    [?]             71.4      28.6
             (2/3, 1/3)       [?]   50.2    0.0    [?]   22.2    [?]             72.6      27.4
GRASS-CORN   (1/3, 2/3)       [?]   29.0    0.2   10.6   55.6    3.0             84.6      15.4
             (2/3, 1/3)       [?]   60.2    0.4    [?]   28.0    2.4             88.2      11.8
SOIL-BRUSH   (1/3, 2/3)       8.6   28.0    1.0   23.0   31.0    8.4             24.0      76.0
             (2/3, 1/3)       7.2   28.6   32.8    [?]   16.2   13.8             34.2      65.8
SOIL-CORN    (1/3, 2/3)       4.8   31.2    4.6    1.6   50.2    7.6             54.8      45.2
             (2/3, 1/3)       [?]   22.6   48.8    0.0   13.0   12.2             61.8      38.2
BRUSH-CORN   (1/3, 2/3)       [?]    [?]    0.0    [?]    [?]    3.0             86.6      13.4
             (2/3, 1/3)       [?]    6.8    0.0   59.6   28.6    3.8             88.2      11.8

FINAL ESTIMATION OF PROPORTION BY CLASS** (CORRECT = 100%)
                              52%   118%    58%   104%   128%                              40%

*  Assigned to one of the two classes considered.
** Assuming all above mixtures equally likely.

In referring back to Table 6.1, the bottom line shows strikingly that,
for this data set, the errors would not compensate. Tree and bare soil classes
are grossly underestimated while corn is significantly overestimated among mixture
pixels.

It is clear, then, that significant numbers of mixture pixels, when processed
by conventional means, will yield significant numbers of false alarms. Further,
the odds that significant numbers of false alarms will compensate one another
so that the proportions of classes may be accurately estimated using
classification counts from conventional classifiers seem rather small. To
complete this study, an investigation of whether there is a fixed, estimable bias in
the proportion estimates is needed.

6.2 BRIEF DESCRIPTION OF THE MIXTURE PROCESSOR 

For the example discussed in the previous section, the task of accurately 
estimating proportions of classes in a scene where a significant portion of 
the pixels are mixture pixels could not be done by using conventional 
classification processing techniques. In the following sections we discuss 
the application of a specialized processor, here called the mixtures 
processor, which allows for the fact that pixels may contain mixtures of 
different ground covers, and is capable of analyzing the proportions of the 
classes present in each pixel. 

Before proceeding further, a short explanation of the manner in which the 
mixtures processor is applied is in order. 

It is obvious that a pixel may be purely or almost purely of one ground
class, or it may be a mixture of several ground classes. Thus the algorithm
used, as its first stage, determines the several likeliest possibilities.
First, the most probable single signature for a pixel and the attendant
chi-square value are determined. (The chi-square value is a measure of the
likelihood that the pixel is a member of the signature distribution being
considered.) Next, the likeliest mixture of two classes is determined, and the
proportion of each class in the pixel and an associated chi-square value are
calculated. The pixel may be further analyzed as a mixture of three and four
classes. For reasons of processing time and computer space requirements, for
the agricultural test site part of this study we limited the consideration to
either pure or two-class mixture pixels. This is not an unrealistic restriction
for this case when one considers the scan swath over the ground: for an
agricultural area like the current data set, most mixture pixels will occur
at field boundaries so that the vast majority of such pixels will be mixtures
of two ground classes. Figure 6.1 also provides an illustration of this situation.
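
The first stage can be sketched as follows. The sketch rests on assumptions that the text does not state: a mixture of classes A and B in proportion p is modeled with mean p*mean_A + (1-p)*mean_B and covariance p*cov_A + (1-p)*cov_B, the chi-square value is taken as the squared Mahalanobis distance, and the proportion is found by a simple grid search. All names and signature values are hypothetical.

```python
import numpy as np


def chi_square(x, mean, cov):
    """Squared Mahalanobis distance of pixel x from a signature."""
    d = x - mean
    return float(d @ np.linalg.solve(cov, d))


def best_pure(x, sigs):
    """Most probable single signature: returns (chi2, name)."""
    return min((chi_square(x, m, c), name) for name, (m, c) in sigs.items())


def best_pair(x, sigs, steps=99):
    """Likeliest two-class mixture: returns (chi2, class_a, class_b, p_a)."""
    best = None
    names = sorted(sigs)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            (ma, ca), (mb, cb) = sigs[a], sigs[b]
            for p in np.linspace(0.01, 0.99, steps):
                chi2 = chi_square(x, p * ma + (1 - p) * mb, p * ca + (1 - p) * cb)
                if best is None or chi2 < best[0]:
                    best = (chi2, a, b, float(p))
    return best


# Hypothetical two-channel signatures: name -> (mean, covariance).
sigs = {
    "corn": (np.array([10.0, 40.0]), np.eye(2)),
    "soil": (np.array([30.0, 20.0]), np.eye(2)),
    "tree": (np.array([10.0, 10.0]), np.eye(2)),
}
pixel = 0.7 * sigs["corn"][0] + 0.3 * sigs["soil"][0]  # 70% corn, 30% soil

pure = best_pure(pixel, sigs)
pair = best_pair(pixel, sigs)
print(pure)
print(pair[1], pair[2], round(pair[3], 2))
```

For this synthetic pixel the best pure hypothesis is corn with a large chi-square value, while the corn-soil pair at a proportion of about 0.7 fits almost exactly.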

The data are then processed through a second stage where a pixel is
determined to be a pure pixel if the chi-square value for the likeliest pure
case is less than some threshold T1. If it is not pure according to this
test, then the chi-square value for the two-class mixture case is compared
to a second threshold T2. If it is less than T2, the pixel is determined to
be the mixture indicated; otherwise, further tests with T3, etc. are
conducted when three and four class cases are considered. If the pixel fails
all the tests, it is considered to be from a class or classes not included
in the signature set. Currently the thresholds T1, T2, etc., are chosen
empirically so as to minimize the error of the proportion estimate over some
training area of known proportion.
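
The second-stage decision logic described above can be sketched as a cascade of threshold tests. The function name and the threshold values below are hypothetical; the text states only that the thresholds are chosen empirically.

```python
def second_stage(chi2_by_order, thresholds):
    """Return the number of classes accepted for a pixel, or None.

    chi2_by_order[k] is the best chi-square value for the (k+1)-class
    hypothesis; thresholds[k] is the matching threshold (T1, T2, ...).
    None means the pixel failed every test and is treated as coming from
    a class or classes outside the signature set.
    """
    for n_classes, (chi2, threshold) in enumerate(
        zip(chi2_by_order, thresholds), start=1
    ):
        if chi2 < threshold:
            return n_classes
    return None


T = [5.0, 4.0]  # hypothetical T1 and T2

print(second_stage([3.0, 1.0], T))    # pure pixel
print(second_stage([72.0, 0.5], T))   # two-class mixture
print(second_stage([72.0, 30.0], T))  # not in the signature set
```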

The chief factor affecting the performance of the mixtures processor is 
the geometrical configuration of the signatures used to define the ground 
cover classes. The signatures can be defined as hyperellipses in an n-dimensional 
orthogonal space where n is the number of bands or SDOs. A simplex is a 
hypervolume defined by m vertices, where a signature mean defines each vertex. 

A pure pixel would be one which is located near a signature mean, while a 
mixture pixel would be one which was located between several of the signatures. 
Further, if for a given set of signatures, the simplex they define is not 
convex; e.g., one signature being a linear combination of some other signatures, 
then the simplex is said to be degenerate. For such a simplex, a non-unique 
answer is mathematically possible and as a result such simplexes should not
be used for processing.

6.3 APPLICATION OF MIXTURES PROCESSOR TO AN AGRICULTURAL SCENE

The initial step in implementing the mixtures processor is to define a 
signature set. It is important that the signatures used be sufficiently 
distant one from the other; that is, the simplex formed by the set of signatures 
cannot be degenerate, otherwise the algorithm breaks down. For this reason 
it is wise to limit the number of signatures used. Also, since the processing 
time goes as m(m+1)/2 (for m signatures), there is a second reason to keep
the size of the set as small as possible. 
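
One way to read the m(m+1)/2 figure, offered here as an assumption rather than as the report's stated derivation, is as the number of hypotheses examined per pixel when every pure signature and every pair of signatures is considered. The sketch below counts these for 15 and 6 signatures, the two set sizes that appear in this section; the function name is hypothetical.

```python
from itertools import combinations


def hypotheses_per_pixel(m):
    """Count pure cases plus two-class pairs for m signatures."""
    pure = m
    pairs = len(list(combinations(range(m), 2)))
    return pure + pairs


for m in (15, 6):
    print(m, hypotheses_per_pixel(m), m * (m + 1) // 2)
```

Under this reading, reducing the set from 15 signatures to 6 cuts the count from 120 to 21.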

For the agricultural test site the set of 15 signatures used for the 
classification had the following composition: 

CORN                    4 Signatures
TREES                   2 Signatures
BRUSH                   1 Signature
GRASSES, WEEDS, ETC.    5 Signatures
BARE SOIL               1 Signature
SOYBEANS                1 Signature
ALFALFA                 1 Signature

Since soybeans and alfalfa are very minor ground covers in the test site, 
we excluded them from this study. An analysis of the tree and brush signatures 
showed the two tree signatures to be very disparate, but the brush and one of 
the tree signatures were found to be very similar spectrally, overlapping
some 75%. The brush signature, representing primarily areas of scrub forest, 
was therefore combined with the one tree signature. As for the corn 
signatures, the two signatures with most of the corn points were found to be 
very different; since corn is a major cover, both these signatures were 
selected for use. The bare soil signature also was included. 

The grasses were represented by 5 diverse signatures. Since combining 
several signatures into one resultant signature with a large spread would 
have decreased the inter-signature distances in the simplex, we endeavored 
to choose just one signature. An examination of 2-dimensional scatter plots
of all the signatures indicated that one grass signature seemed to be more
toward the exterior of the total signature simplex than any of the other grass 
signatures. That cluster probably represents the grass subclass which had 
the highest percentage ground cover and thus the lushest condition of the 
grass object class. This grass signature was selected to represent grass 
with the hope that pixels from pasture or weed fields would be called a 
mixture of grass and bare soil. 

The signature set described above was applied to a small 550-pixel
section of the data. Subsequent analysis showed that very little of the data
were being called out as grass, and as a result the error rate was substantial.
It seemed that the initial choice of a grass signature was a poor one.
Accordingly, a different grass signature was selected, this one being from
the grass cluster containing the greatest number of grass pixels.

The test data subset was again processed through the mixtures classifier. The
results were somewhat better, but the total error in the proportion estimation
for the test data subset was still slightly worse than the error rate achieved using
the normal, i.e., linear maximum likelihood, classifier. It was further
noted that the chi-square thresholds chosen, which minimized the total error
of the proportion estimate, resulted in 73% of the pixels being counted as
"pure" and only 18% of the pixels being assessed as mixtures. Many more
mixture pixels had been anticipated.

One hypothesis that might explain these results is that the conventional
classification had been done using 15 signatures, whereas the mixtures approach
used only six. It seems that it would be necessary to further pack the signature
simplex with other grass signatures so as to increase the grass classification
rate. Such a procedure would increase the grass classification, but it would
further decrease the number of pixels processed as mixtures.

That few pixels were called out as mixture pixels seems to be another 
result of the poor signal range discussed in Section 2. The signature set 
is such that not only are the means relatively close together, but also the
individual distributions are very broad so that pixels which are mixtures of
separate classes are themselves very near the center of some distribution so 
that they would be classified as being from that distribution. Figure 6.2 
illustrates the point, and the reader is referred back to Figures 3.1 and 
3.2 for further illustrations of this point using the S-192 data. Because 
of these results, no further mixtures processing was performed on the 
agricultural test site data. 

6.4 APPLICATION OF MIXTURE PROCESSOR TO URBAN AREA 

As a second exercise, the mixtures processor was used to classify two 
small portions of data from the urban area of Lansing, Michigan, using 
signatures acquired by clustering the data. This portion of the data and 
the training methods used were specified in Section 5.1 of this report. 

For this exercise, we were interested in determining the amount of
vegetative material, or alternatively of impervious materials, in an urban
area. Such information is of use to geographers and urban planners, and has a
bearing on local urban climatology, etc. In this case it was expected that most
of the classes of interest would be smaller than the resolution size of the
scanner. In other words, it was expected that each pixel would be a mixture of
two, three or even more classes.

Initially, five of the signatures from the set were identified as being
classes of interest for this problem: green vegetation, concrete, other
impervious (rooftops, asphalt, etc.), bare soil, and water (there is a river
which runs through the city). This signature set was analyzed using program
GEOM. This program calculates a measure of separateness (in a probability
sense) for each signature mean in the simplex. The measure calculated is
roughly the distance in standard deviations between the signature mean and
the hyperplane through the other signature means. If the distance for a
given signature is small, then the simplex is liable to be degenerate and
the mixtures algorithm will not work well. The results, Table 6.2, show that
the simplex of these five classes is degenerate: concrete, other impervious
and bare soil each overlap with the simplex formed by the other four signatures.
Additionally the other two distances are small.


TABLE 6.2

GEOM RESULTS FOR AN URBAN 5 SIGNATURE SIMPLEX

CLASS                 GEOM DISTANCE

Green Vegetation          2.98
Concrete                  0.28
Water                     1.99
Other Impervious          0.26
Bare Soil                 0.58
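
A separateness measure of the kind attributed to GEOM can be sketched as follows. The GEOM program itself is not documented here, so this is only one plausible reading: the space is whitened with the covariance of the signature under test, and the Euclidean distance from its mean to the hyperplane (affine hull) through the other means is then measured. The function name and the example signatures are hypothetical.

```python
import numpy as np


def geom_distance(means, covs, i):
    """Distance, in standard deviations of signature i, from its mean to
    the hyperplane through the other signature means (a sketch only)."""
    means = np.asarray(means, dtype=float)
    # Whiten with signature i's covariance: cov = L @ L.T, x -> inv(L) @ x
    L = np.linalg.cholesky(covs[i])
    white = np.linalg.solve(L, means.T).T
    others = np.delete(white, i, axis=0)
    base = others[0]
    directions = (others[1:] - base).T  # columns span the hyperplane
    target = white[i] - base
    if directions.size:
        coeffs, *_ = np.linalg.lstsq(directions, target, rcond=None)
        residual = target - directions @ coeffs
    else:
        residual = target
    return float(np.linalg.norm(residual))


# Hypothetical three-signature simplex in two channels.
means = [[0.0, 0.0], [4.0, 0.0], [2.0, 3.0]]
unit = [np.eye(2)] * 3
broad = [np.eye(2), np.eye(2), 4.0 * np.eye(2)]  # third signature twice as wide

print(geom_distance(means, unit, 2))
print(geom_distance(means, broad, 2))
print(geom_distance([[0.0, 0.0], [4.0, 0.0], [2.0, 0.1]], unit, 2))
```

In this sketch the distance shrinks either when a mean moves toward the hyperplane of the others or when its distribution broadens, and both push the simplex toward degeneracy.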


Investigating further, we tried all 4-tuples to see if some of these 
simplexes would not be degenerate. All were degenerate. Next all triplets 
of signatures were tried and here several of the combinations yielded non- 
degenerate simplexes. From these results the triplet of concrete, other 
impervious and green vegetation was chosen for the processing effort, since 
it seemed that these classes would be the most prominent in the scene. The 
GEOM results for this triplet are given in Table 6.3 below. 


TABLE 6.3

GEOM RESULTS FOR FINAL URBAN SIMPLEX

CLASS                 GEOM DISTANCE

Green Vegetation          6.06
Concrete                  3.0
Other Impervious          4.5


The fact that simplexes with more than three signatures were degenerate 
indicates that only two out of seven channels were important for separating 
these classes, the other five being redundant. This follows since for the
spectrally disparate classes involved, a non-degenerate simplex existed only
for some triplets of the signature set, and each triplet in turn defines only 
a plane (2-space). Thus there are only two independent channels for this 
problem. 

The mixtures processor described in the previous section was implemented
to process the data, using the three signatures mentioned above. It was
noticed that, for most of the pixels, low chi-square values were being
calculated for the best one-at-a-time case, i.e., that it was most probable
that the pixel was pure. The rest of the pixels were deemed most likely to
be mixtures of a pair of classes. Only a few pixels were deemed to be mixtures
of the three classes. The results were also poor, with the other impervious
signature overestimated and the vegetation greatly underestimated.

6.5 CONCLUSIONS 

It was concluded from the results of both studies that the lack of 
adequate dynamic range, as demonstrated here by the size of the simplex in 
relation to the size of the class distributions, precluded the possibility 
for most of the pixels to be processed as mixture pixels since the pixels 
were associated with higher probabilities of being pure. The mixtures processor 
discussed in this section cannot be expected to yield good results under
these circumstances.

7

CONCLUSIONS AND RECOMMENDATIONS 

In preparation for the processing and analysis of SKYLAB S-192 data, a
fairly detailed examination of the data was undertaken, investigating in each
SDO (Scientific Data Output) signal-to-noise characteristics and dynamic
range. Aircraft scanner data gathered over the agricultural site the
morning of S-192 data collection were examined also and used as a basis for
comparison. The results of the examination of S-192 data quality were
essentially in keeping with the published S-192 performance evaluations [4].
Conclusions reached were that four of the spectral bands were sufficiently
noisy so as not to be of use in classification processing and that the
remaining bands all had a very limited range of values in relation to the
noise content of the data. Also examined was the spatial registration of the
scanner data. The SDO-to-SDO misregistration in conic data was measured and
shown to be greater than one pixel in some instances. More importantly,
further analysis showed that the effect of scan-line-straightening was to
compound and increase the misregistration of the S-192 data: a maximum
misregistration of 2.2 pixels was calculated. Not only is the misregistration
of scan-line-straightened data not easily correctable but the additional
misregistration seriously reduces the number of pure pixels available for
training.

Analytical and simulation studies were performed to investigate the
effects of misregistration on classification accuracy. The results showed that,
for pixels which imaged more than one ground class in one or more channels,
the error rate was substantial and increased as the degree of misregistration
increased. Also shown was that while the correct classification rate for pure
(one class) pixels did not change significantly as misregistration increased,
the number of such pure pixels markedly decreased as misregistration increased.
Because of the increased, uncorrectable misregistration in
scan-line-straightened data, the recognition processing for this contract was
carried out with conic data. Using the conic data, we were able to substantially
correct for misregistration by selecting a set of 13 SDOs (one for each band) and
shifting some relative to others such that the maximum misregistration was
one third of a pixel.

In preparation for recognition processing of the agricultural test site
using conventional techniques, a set of training statistics was extracted
and the utility of the 13 spectral bands for recognition processing of this area
was determined. Using a computer algorithm which computed the average pairwise
probability of misclassification, the 13 bands were rank ordered with the
result that the four bands previously identified as having poor signal quality
were adjudged to be among the worst bands. The two best bands, by far, were
1.55-1.73 µm (SDO 12) and 0.93-1.05 µm (SDO 19). The results of classifying
the agricultural site using conventional techniques and the 7 best bands were
somewhat disappointing, with accuracies for field center pixels on the order
of 70%, and with confusion noted among a triad of corn, trees and brush. The
classification of the data was affected by a combination of the limited
signal range in the data and the apparent spectral similarity of many of the
ground classes. The latter effect was attributed to the contrast reducing
effect of atmospheric haze and the fact that, at the time of year the data
were collected, there was a large range of conditions for several classes
(e.g., some of the corn had tasseled and some had not) leading therefore to
added spectral similarity among classes. Errors in the proportion estimation
were also affected by the large number of mixture pixels in the scene. A
brief study indicated that more than 70% of the scene was composed of such
mixture pixels. In general a disproportionate number of such pixels were
classified as corn, resulting in a substantial overestimation of corn in
the scene.

The utility of signature extension techniques for S-192 data was tested
using the Lansing and Ypsilanti sites for training and test, respectively.
Several signature extension techniques were utilized to process data for the
signature extension test site located some 70 miles from the signature
extension training area. The test area was chosen particularly because a layer
of haze covering this site was very evident in the S-190B imagery; thus, this
was a test under very different atmospheric conditions as well as a test over
distance. Training statistics were gathered using an unsupervised clustering
technique and clusters for urban, residential, vegetation, water, concrete, bare
soil and sparse vegetation were generated. A classification attempt without
the use of signature extension techniques resulted in poor accuracy while the
use of signature extension techniques improved classification accuracy. The
best results were obtained using the dark object algorithm. In a qualitative
sense these results matched those obtained using local clusters (i.e., clusters
generated at the signature extension site).

Further classification was carried out on both training sites previously 
mentioned using the unresolved object or mixtures classifier. Such a classifier
would seem to be well suited to a data set where more than 70% of the pixels
were mixture pixels. The results of using this approach on both sites were
unsatisfactory, due apparently to the previously mentioned limited signal range,
contrast and spectral discriminability of the data. Thus, no general 
conclusions were drawn with regard to the utility of the mixtures classifier 
on S-192 data. 

Results of this investigation indicate that deficiencies in the S-192 
data will tend to limit its ultimate utility and that to minimize deleterious 
effects of channel-to-channel misregistration the further use of S-192 data 
in conic format is recommended. Furthermore, the design of future multispectral 
scanner and data processing systems should take into account the experience 
gained in processing and analyzing S-192 data. To this end, two recommendations 
are made. First, finer spatial resolution should be considered for future 
sensors; this would alleviate the problems caused by having a large proportion
of mixture pixels in the scene and the attendant problem of having so few 
pure pixels on which to base training statistics. The second recommendation 
is that future systems provide a means to adjust scanner gain and offset 
parameters to better match the radiance characteristics of individual scenes 
and thus make fuller use of the available scanner dynamic range. For space- 
craft scanners the long atmospheric path traversed by the ground-reflected 
radiation has the effect of adding a sizeable constant radiance (path radiance) 
while also attenuating the radiation resulting in reduced contrast in the data. 
If future scanners are designed with appropriate offset and gain capabilities
(indeed, is there a need to set the zero response of a band equal to a zero
radiance level in that band, or should it rather be set close to a zero
reflectance level?), it is safe to say that higher contrast, more useful data
would result. As for making specific recommendations regarding spatial and
radiometric parameters of future scanner systems, such work was beyond the 
scope and context of this investigation. These are very complex areas and 
need to be properly and fully addressed in order to derive more definitive 
recommendations for future spacecraft multispectral scanners.

S-192 SCANNER CHARACTERISTICS
A. SPECTRAL CHARACTERISTICS

BAND    SDO's       λ (µm)

  1     22          .41-.45
  2     18          .45-.50
  3     1,2         .50-.55
  4     3,4         .54-.60
  5     5,6         .60-.65
  6     7,8         .66-.73
  7     9,10        .77-.89
  8     19          .93-1.05
  9     20          1.03-1.19
 10     17          1.15-1.28
 11     11,12       1.55-1.73
 12     13,14       2.10-2.34
 13     15,16,21    10.2-12.5

OPTICAL CHARACTERISTICS

Instantaneous Field of View              0.182 mrad
Scan Rate                                94.79 revs/sec
No. of Samples/Scanline/Detector:
    Low Sample Rate Bands                1240
    High Sample Rate Bands               2480
Analog to Digital Conversion             8 bits/value
Cone Angle                               5°32'
Portion of Scan Viewing the Ground       116.25°
Scan Swath                               72.4 km
Altitude at time of data collection

APPENDIX II

M-7 SCANNER CHARACTERISTICS [6]

A. SPECTRAL CHARACTERISTICS FOR MISSION 85M, AUGUST, 1973

BAND    λ (µm)

  1     .41-.48
  2     .46-.49
  3     .48-.52
  4     .50-.54
  5     .52-.57
  6     .55-.60
  7     .58-.64
  8     .62-.70
  9     .67-.94
 10     1.0-1.4
 11     1.5-1.8
 12     9.3-11.7

OPTICAL CHARACTERISTICS

Resolution
    Spectrometer (Bands 1-9)             2.0 X 2.0 mrad
    Near IR (Bands 10,11)                2.0 X 4.0 mrad
    Thermal (9.3-11.7 µm)                3.3 X 3.3 mrad
Scan Rate                                60 scans/sec
Along track velocity                     2.75 ft/scan
Analog to Digital Conversion             9 bits/value
Altitude at time of data collection      2000 ft
Portion of Scan Viewing Ground           90°
Scan Swath                               4000 ft

APPENDIX III

SOUTHEAST MICHIGAN TEST SITE GROUND TRUTH 


The Southeast Michigan Test Site consists of three rural townships, LeRoy,
Locke and White Oak, in Ingham County. The location of the test site is given
in the map, Figure III.1. Michigan State University provided ground truth for
the three townships. The acreages and number of fields of each ground cover
class are given in Tables III.1-III.3. Designations are grouped as follows.


CORN        - corn
SOYBEANS    - soybeans
TREES       - trees, brush, woods
GRASS       - grasses, Sudan grass, clover, weeds, pasture,
              short grass, tall weeds
STUBBLE     - stubble, cut grass, cut oats, cut wheat, cut beans
SOIL        - soil, bare soil
ALFALFA     - alfalfa
OTHER       - identified by SYMBOL:
              D - barley
              F - lettuce
              H - hay
              I - onions
              J - orchard
              N - beans
              O - oats
              W - wheat
              X - homesteads, buildings, towns, freeway
              Y - water, lakes, swamp
              ? - unknown, crop?, illegible
CLOUD COVER - Indicates that the section was cloud covered in the
              high altitude photography which served as a source
              for "ground truth".

LOCATION OF S-192 TEST SITE ON EXCERPT OF ROAD MAP
OF SOUTHERN LOWER MICHIGAN 












TABLE III.1. GROUND TRUTH FOR LOCKE TOWNSHIP, INGHAM COUNTY, MICHIGAN
GIVEN IN ACRES AND NUMBER OF FIELDS


LOCKE: SECTIONS = 30






TABLE III.2. GROUND TRUTH FOR LEROY TOWNSHIP, INGHAM COUNTY, MICHIGAN
GIVEN IN ACRES AND NUMBER OF FIELDS

LEROY: SECTIONS = 29

SECT.     CORN          SOYBEAN       TREES         GRASS         STUBBLE       SOIL          ALFALFA       OTHER         TOTAL       SYMBOL FOR OTHER
          #    ACRE     #    ACRE     #    ACRE     #    ACRE     #    ACRE     #    ACRE     #    ACRE     #    ACRE     #    ACRE

2         1    41.6     -       -     5   193.1     6   269.1     2    22.8     4    13.5     -       -     1     8.2    19   548.3  Y
3         3   105.4     1    14.0     4    83.7     3   148.6     4   104.8     2    65.5     2    29.2     -       -    19   551.2
4         4    50.9     1    71.3     4   180.4     4   180.1     3    14.6     2    11.1     3    41.0     1     8.8    22   558.2  Y
5         8   203.0     -       -     3   160.4     4    52.9     3   104.7     2    19.3     -       -     -       -    20   540.3
6         7   205.2     1     6.5     3    45.7     4    66.7     1    18.8     2    35.7     -       -     2    21.7    20   400.3  X
7         7    97.2     3    69.7     3    33.9     4    67.4     6   114.7     3    32.1     -       -     3    62.6    29   477.6  2N,X
8         7   278.6     -       -     4   104.1     7   159.6     1    21.6     3    50.4     -       -     3    29.9    25   644.2  2X,Y
9         8   300.7     3    33.3     5    84.2     1    14.0     4    73.7     1    27.5     -       -     3   109.4    25   642.8  X,Y,e
10        9   217.1     -       -     4    32.8     4    96.5     5    65.4     8    97.1     5    96.5     1    42.1    36   647.5  X
11        5    82.5     -       -     4    51.0     9   226.7     5    51.0     3    23.9     2    48.0     1   157.4    29   640.5  X
14        6   136.1     -       -     8    80.0    12   280.5     -       -     8    81.1     2    10.3     3    38.0    39   626.0  2X,e
15        8   231.0     -       -     3    62.8     7   159.6     4    35.8     5    27.0     4    61.2     2    47.1    33   624.5  2X
16        7   373.4     2    31.7     5    69.6     4    62.8     4    70.3     1    10.3     -       -     -       -    23   618.1
17       10   355.8     1    11.5     2    32.9    12   159.2     2    23.1     -       -     -       -     2    61.5    29   644.0  6
18        6    54.6     2    17.8     6    54.1     5   207.1     4    73.1     5    35.2     2    19.5     -       -    30   461.4
19        9   152.8     3    42.7     2    75.4     8   108.2     2    62.8     2    12.6     1    19.6     1    24.2    28   498.3  N
20        5   364.7     -       -     5    42.3     7    64.3     4    86.4     3    25.3     4    36.3     2    30.0    30   649.3  8,X
21        3   269.1     -       -     1   194.8     3    53.7     2    73.2     1    25.4     -       -     2    28.8    12   645.0  2O
22        6   200.4     2    59.3     4    98.5     5   133.8     3    78.9     -       -     1    69.1     -       -    21   640.0
23        8   256.3     -       -     3    70.8    11   155.0     4    60.5     3    91.7     1    10.3     -       -    30   644.6
26        6   300.0     1    14.9     3   189.5     8    84.2     -       -     6    59.7     -       -     -       -    24   648.3
27        7   212.3     3    26.4     3    16.2    12   148.2     2    39.7     7    95.4     4    72.7     1    20.1    39   631.0  N
28       12   221.8     2    57.6     6   127.3     8   141.1     2    20.1     7    54.2     2    20.8     -       -    39   642.9
29       12   230.1     4    91.4     3    55.9     9   170.4     2    25.3     3    63.3     -       -     1     9.8    34   646.2  O
30       11   201.8     1    14.4     2    41.9     7   127.8     5    51.6     7    64.3     3    20.1     1    10.3    37   532.2  9
31        5   183.8     1    23.6     3    99.6     5    72.1     2    22.6     4    89.9     1    36.8     1    19.0    22   547.4  D
32        9   243.8     2    23.1     3    51.9     6    95.7     4    87.4     1     2.9     -       -     5   148.2    30   653.0  N,2O,B,H
33       11   239.6     6    72.6     3   103.6     6    60.7     8   102.1     3    17.2     2     8.6     2    16.1    41   620.5  N,X
34     Cloud Cover
35        8   374.2     1    17.3     3    87.0     4    79.7     1    11.5     4    47.3     1    29.3     -       -    22   646.3

TOTAL   208  6183.8    40   699.1   107  2523.4   185  3645.7    89  1516.5   100  1178.9    40   629.3    38   893.2   807 17269.9
AVE.    7.2   213.2   1.4    24.1   3.4    87.0   6.4   125.7   3.1    52.3   3.4    40.7   1.4    21.7   1.3    30.8  27.8   595.5


TABLE III.3. GROUND TRUTH FOR WHITE OAK TOWNSHIP, INGHAM COUNTY, MICHIGAN
GIVEN IN ACRES AND NUMBER OF FIELDS

WHITE OAK: SECTIONS = 29

SECT.     CORN          SOYBEAN       TREES         GRASS         STUBBLE       SOIL          ALFALFA       OTHER         TOTAL       SYMBOL FOR OTHER
          #    ACRE     #    ACRE     #    ACRE     #    ACRE     #    ACRE     #    ACRE     #    ACRE     #    ACRE     #    ACRE

2         9   205.1     -       -     6    38.0    12   350.0     2    29.9     6    49.5     2    66.8     -       -    37   739.3
3        14   319.9     1    38.6     3   110.0    11   118.8     -       -     8    99.7     1     5.7     1     4.1    39   696.8  X
4         7   208.6     2    19.7     9    99.7     8   160.7     5   146.1     4    32.2     1    17.3     1     8.7    37   693.0  N
5         5   103.1     4   100.2     2   120.9    10   137.1    10   125.3     -       -     -       -     1   106.0    32   692.6  N
6        18   222.1     5    59.3     4    69.7     3    21.7     7    86.6     9   119.6     1    21.9     1     6.9    48   607.8  O
7         6   139.4     1    26.5     4   189.4     3    45.0     3    39.7     5    35.2     2    81.9     1    19.7    25   576.8  9
8         5   178.0     2    27.0     6    94.3     6   214.8     1    28.8     3    21.2     -       -     5    79.0    28   643.1  H,4K
9         8   293.3     2    47.8     6    73.0     4    42.9     2    38.0     5    94.0     -       -     3    57.1    30   646.1  !C,2H
10        7   235.5     -       -     5   141.1     5    88.0     1    11.5     4    32.1     3   109.3     3    10.8    28   628.3  3X
11        2    72.6     -       -     4   292.8     5   224.6     -       -     2    61.7     1     7.5     -       -    14   659.2
14        3    91.0     -       -     6   156.0    12   337.7     2    36.8     1    25.4     -       -     -       -    24   646.9
15        9   267.5     -       -     4    69.1     3    65.6     1    88.7     6   119.8     1     4.1     2    13.9    26   628.7  D,X
16       12   307.5     3    18.9     8    89.4     9   172.3     1    19.0     3    28.2     -       -     -       -    36   635.3
17        8   159.5     2    43.2     5   150.5     5    55.4     6    62.1     9    74.8     -       -     6    91.0    41   636.5  ?,5H
18       11   247.3     -       -     9    99.7     5    74.3     4    76.7     3    55.9     1    11.0     2    22.4    35   587.3  e,X
19        8   201.3     1    39.8     4    76.5     3    45.4     4   128.9     1    29.3     1    45.5     1     2.9    23   569.6  X
20        8   207.3     2    89.2     6   100.2     9   168.7     4    28.3     2     6.8     -       -     3    42.0    34   642.5  N,X,J
21        7   192.4     2    24.7     9    80.7     6   236.8     4    36.3     -       -     -       -     2    70.9    30   641.8  N,9
22       12   233.4     2    71.9     5    30.5     7    78.3     6   121.6     5   100.7     1     7.5     -       -    38   643.9
23        2    29.5     -       -     2    90.5     8   196.8     2    30.5     3    88.8     -       -     8   196.4    25   632.5  Y,I,5F
26        2    62.7     1     7.5     4   347.9     9   192.8     1    16.2     1     1.8     -       -     2    11.0    20   639.9  2Y
27       11   149.9     2    29.9     6   130.5    12   161.9     8    83.5     8    59.3     1     8.7     4    12.1    52   635.8  O,3X
28        5   110.0     1    26.5     7   137.7     4   176.8     6    93.2     6    71.2     3    36.0     -       -    32   651.4
29        4   277.7     -       -     1    46.1     6   206.8     4    39.1     2    35.2     1    12.6     2    31.1    20   648.6  H,X
30        8   153.3     -       -     7   107.7    12   127.7     5    59.3     5    47.1     2    32.8     5    58.5    44   586.4  3H,2X
31     Cloud Cover
32       12   242.5     -       -     3    52.4     6   131.0     4    39.1     4    23.7     7   127.8     1     4.1    37   620.6  X
33        9   139.2     1    15.5     9   135.4     7   143.3     7    82.8     3    37.4     1     4.6     5    76.0    42   634.2  ?,W,X,2H
34       11   199.5     -       -     9   142.3    11   157.7    10    99.1     3    13.3     1     4.6     3    21.9    48   638.4  O,2X
35        7   184.3     1     6.9     7   146.2     7   141.6     4    72.6     -       -     4    72.5     1    11.5    31   635.6  J

TOTAL   230  5433.4    35   693.1   160  3418.2   208  4274.5   114  1719.7   111  1363.9    35   678.1    63   958.0   956 18538.9
AVE.    7.9   187.4   1.2    23.9   5.5   117.9   7.2   147.4   3.9    59.3   3.8    47.0   1.2    23.4   2.2    33.0  33.0   639.3

If a field was listed as 1/2 one crop and 1/2 another, it was treated
as if it were 2 separate fields; but if it was listed as woods pasture or
weeds and brush, it was placed under the category first mentioned. However,
weedy soybeans were called soybeans. Since fields with dual crop identification
were arbitrarily classified by the first designation, there may be a
bias in the results. This bias is likely to be significant only for the 
GRASS and TREES categories. 

Table III.4 totals the information from the previous tables. The
percentages of fields belonging to each ground cover class do not differ
significantly between townships. However, the percentage of the total acreage
is significantly different for corn and grass. Corn covers 35.8 percent of LeRoy
Township but only 26.0 percent of Locke, while grass ranges from 21.1 percent
in LeRoy to 31.9 percent in Locke Township. The major ground cover classes,
in order of decreasing importance according to the percent found in the test
site, are listed below:


Corn         30.3%
Grass        25.5%
Woods        16.8%
Stubble       9.4%
Bare Soil     7.2%

All other ground covers represent less than 5% of the total acreage of the area.

TABLE III.4. PERCENTAGE TOTALS OF ACREAGES AND NUMBER OF FIELDS FOR VARIOUS
GROUND COVER CLASSES FOR EACH OF THE THREE TOWNSHIPS AND FOR
THE ENTIRE TEST SITE.

                    LOCKE                      LEROY                    WHITE OAK                    TOTALS
           % TOTAL  % TOTAL  AVERAGE  % TOTAL  % TOTAL  AVERAGE  % TOTAL  % TOTAL  AVERAGE  % TOTAL  % TOTAL  AVERAGE
            FIELDS  ACREAGE  ACREAGE   FIELDS  ACREAGE  ACREAGE   FIELDS  ACREAGE  ACREAGE   FIELDS  ACREAGE  ACREAGE

CORN          24.0     26.0     21.7     25.8     35.8     29.7     24.0     29.3     23.6     24.6     30.3     24.9
SOYBEAN        4.5      3.8     16.6      5.0      4.0     17.5      3.7      3.7     19.8      4.3      3.8     17.8
TREES         15.7     17.2     21.9     13.3     14.6     23.6     16.7     18.4     21.4     15.3     16.8     22.1
GRASS         28.3     31.9     22.6     22.9     21.1     19.7     21.8     23.1     20.6     24.4     25.5     21.1
STUBBLE       12.0     10.1     16.9     11.0      8.8     17.0     11.9      9.3     15.1       --      9.4     16.3
SOIL            --      7.3     14.1     12.4      6.8     11.8     11.6      7.4     12.3       --      7.2     12.7
ALFALFA         --      2.0     20.5      5.0      3.6     15.7      3.7      3.7     19.4      3.5      3.1     18.0
OTHER           --      1.7     11.0      4.7      5.2     23.5      6.6      5.2     15.2      4.8      4.0     16.7
TOTAL                           20.0                       21.4                       19.4                       20.2

APPENDIX IV

DERIVATION OF CROSS-CORRELATION FOR MISREGISTRATION STUDY 

The following is a procedure for determining the amount of misregistration
between two correlated data channels. By reconstructing the continuous waveform
over a lengthy interval in both channels, the cross-correlation function of the
two waveforms can be determined. Let f(t) and g(t) denote the reconstructed
waveforms in the two channels over the interval [A,C]. The cross-correlation
function $r(t_0)$ is defined as

$$r(t_0) = \int_A^C f(t)\, g(t + t_0)\, dt$$


The amount of misregistration between the two channels can be estimated as the
value of the parameter $t_0$ which maximizes the cross-correlation. The continuous
waveforms can be reconstructed from the sample values by making assumptions
which allow the use of Shannon's sampling theorem. The sampled data are converted
into continuous form to allow the misregistration to be estimated to within a
fraction of a pixel rather than in whole pixel increments. The interval [A,C]
must be long in comparison to the range of the parameter values $t_0$. This
condition is required to minimize the effect of inaccuracies which
will occur near the endpoints of the interval.

Shannon's sampling theorem indicates that a continuous signal y(t),
bandlimited to B (radians/sec), can be exactly reconstructed from samples
taken with a sampling interval $\tau = \pi/B$. The sampling rate is equal to
twice the highest frequency component contained in the signal. The original
signal y(t) can be expressed in terms of the sample values $y(m\tau)$ as

$$y(t) = \sum_{m=-\infty}^{\infty} y(m\tau)\, \frac{\sin B(t - m\tau)}{B(t - m\tau)}$$
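
The reconstruction formula can be tried numerically. The following is only a minimal sketch, not the program used in this investigation: the function name `sinc_reconstruct` is hypothetical, and the infinite sum is truncated to whatever samples are supplied, so values near the ends of the record are less accurate, as the text notes.

```python
import math


def sinc_reconstruct(samples, tau, t):
    """Evaluate the sampling-theorem interpolation of `samples` at time t.

    Assumption: samples[m] holds y(m * tau) and the signal is bandlimited
    to B = pi / tau. The infinite sum is truncated to the given samples.
    """
    bandlimit = math.pi / tau  # B in radians per unit time
    total = 0.0
    for m, y_m in enumerate(samples):
        x = bandlimit * (t - m * tau)
        # sin(x)/x, with the limiting value 1 at x = 0
        total += y_m * (1.0 if x == 0.0 else math.sin(x) / x)
    return total


if __name__ == "__main__":
    tau = 0.5
    samples = [math.cos(0.4 * m * tau) for m in range(64)]
    # Value halfway between samples 31 and 32, near the middle of the record
    print(sinc_reconstruct(samples, tau, 31.5 * tau))
```

At a sampling instant the interpolation returns the sample itself, since every other term falls on a zero of the sin(x)/x kernel.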


Assume that the two continuous data channels f(t) and g(t) are bandlimited
to B and that the sampling interval $\tau$ is equal to $\pi/B$. Let the sample values
of these two waveforms over the interval [A,C] be denoted as $f(k\tau)$ and $g(i\tau)$,
$i, k = 1, \ldots, N$. The cross-correlation $r(t_0)$ can be expressed in terms
of the samples as

$$r(t_0) = \sum_{i,k=1}^{N} f(k\tau)\, g(i\tau) \int_A^C \frac{\sin B(t - k\tau)}{B(t - k\tau)}\, \frac{\sin B(t + t_0 - i\tau)}{B(t + t_0 - i\tau)}\, dt$$


Using a variation of Parseval's Theorem, the integral can be evaluated by
extending the limits of integration to positive and negative infinity, and
$r(t_0)$ can be expressed as

$$r(t_0) = \frac{\pi}{B} \sum_{i,k=1}^{N} f(k\tau)\, g(i\tau)\, \frac{\sin B(k\tau - i\tau + t_0)}{B(k\tau - i\tau + t_0)}$$

or, since $B\tau = \pi$,

$$r(t_0) = \frac{\pi}{B} \sum_{i,k=1}^{N} f(k\tau)\, g(i\tau)\, \frac{\sin \pi(k - i + t_0/\tau)}{\pi(k - i + t_0/\tau)}$$

This relationship can be expressed in terms of a fraction of a sampling interval
(or fraction of a pixel) by defining a variable $\Delta = t_0/\tau$. Then

$$r(\Delta) = \frac{\pi}{B} \sum_{i,k=1}^{N} f(k\tau)\, g(i\tau)\, \frac{\sin \pi(k - i + \Delta)}{\pi(k - i + \Delta)}$$

Neglecting the constant factor $\pi/B$ and expressing $f(k\tau)$ and $g(i\tau)$ as $f_k$ and
$g_i$, respectively, the function $d(\Delta)$ must be evaluated, where

$$d(\Delta) = \sum_{i,k=1}^{N} f_k\, g_i\, \frac{\sin \pi(k - i + \Delta)}{\pi(k - i + \Delta)}$$

which can be simplified as

$$d(\Delta) = \sum_{j=-(N-1)}^{N-1} \Bigl[ \sum_{i-k=j} f_k\, g_i \Bigr] \frac{\sin \pi(\Delta - j)}{\pi(\Delta - j)}$$


For large N, the variable j need not extend over the entire range because of
the insignificant contribution of the terms with large |j|. To reduce the
effects of noise, the function $d(\Delta)$ should be determined for several scan lines
and averaged.
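
The estimator derived above can be sketched in a few lines. This is an illustration under stated assumptions rather than the original program: the function names, the lag limit `max_lag` and the search grid `steps_per_pixel` are all hypothetical choices, and the channel means are subtracted first, which is one way to keep a constant (DC) level from dominating the correlation.

```python
import math


def sinc_pi(x):
    """sin(pi x) / (pi x), with the limiting value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def lagged_products(f, g, max_lag):
    """c[j] = sum over k of f[k] * g[k + j] for j = -max_lag .. max_lag."""
    n = len(f)
    return {
        j: sum(f[k] * g[k + j] for k in range(max(0, -j), min(n, n - j)))
        for j in range(-max_lag, max_lag + 1)
    }


def d_of_delta(delta, products):
    """Interpolated cross-correlation d(delta), truncated to |j| <= max_lag."""
    return sum(c * sinc_pi(delta - j) for j, c in products.items())


def estimate_misregistration(f, g, max_lag=5, steps_per_pixel=20):
    """Return the delta (in pixels) on a regular grid that maximizes d(delta)."""
    mean_f = sum(f) / len(f)
    mean_g = sum(g) / len(g)
    f0 = [v - mean_f for v in f]  # remove the average level of each channel
    g0 = [v - mean_g for v in g]
    products = lagged_products(f0, g0, max_lag)
    span = max_lag * steps_per_pixel
    candidates = [i / steps_per_pixel for i in range(-span, span + 1)]
    return max(candidates, key=lambda delta: d_of_delta(delta, products))
```

Averaging over several scan lines, as recommended above, could be done by summing the `lagged_products` of each line before the search; the grid search itself is a simple stand-in for any one-dimensional maximizer.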

Initial tests of this algorithm indicated that the misregistration estimate
was being biased by the DC (average) component of the signal in each channel.
To remove this bias, the algorithm was modified to subtract out the mean value
of each channel before computing the cross-correlation. In essence, this
means that the cross-correlation between the AC (varying) components of the signals
was then computed, and this modification removed the bias that had been noted.

APPENDIX V

DIGITIZATION AND PREPROCESSING OF M-7 DATA 

The data set selected for processing was Run 2 which was collected over 
Flight line 1 of the intensive study site at an altitude of 2000 feet. The 
time of data collection was 1100 hours EDT, approximately the same time as
the SKYLAB overpass.

First the analog tape was duplicated to remove relative skew (misregistration) 
between channels, and the tape was reviewed in regard to data quality. The scan 
rate was checked and found to be 60 cycles per second as per specifications. 

The relative ground speed of the aircraft was checked and found to be 
approximately 2.75 feet/scan or 98 knots. Each data channel was checked. The 
only problem found was in the thermal channel, track 12, where the offset was
very noisy, with variations in the cold plate signal of as much as 15% of the
total dynamic range.

As mentioned above, the data were gathered at an altitude of 2000 feet.
This means that the ground size of each resolution element is very small
compared to the size of ground objects of interest; or conversely, that each
ground object of interest would contain an enormous number of resolution
elements. For example, the spectrometer on the M-7 scanner exhibits a
resolution of two milliradians, resulting in a resolution element of four
feet by four feet. A typical 15 acre agricultural field would be scanned
by as many as 40,875 resolution elements.
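
The arithmetic behind these figures can be checked with a short sketch. It assumes a small-angle footprint (altitude times angular resolution) and the standard 43,560 square feet per acre; the function names are illustrative only. The result is on the order of 40,000 elements, consistent with the figure quoted above.

```python
SQ_FT_PER_ACRE = 43560.0


def ground_element_ft(altitude_ft, ifov_mrad):
    """Side of one resolution element on the ground (small-angle approximation)."""
    return altitude_ft * ifov_mrad / 1000.0


def elements_in_field(acres, element_ft):
    """Number of square resolution elements needed to tile a field."""
    return acres * SQ_FT_PER_ACRE / (element_ft ** 2)


if __name__ == "__main__":
    side = ground_element_ft(2000.0, 2.0)  # M-7 spectrometer at 2000 ft
    print(side, elements_in_field(15.0, side))
```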

Accordingly, it was felt that we could take advantage of the gross
redundancy in the data by means of spatial filtering to improve the signal
to noise ratio of the data, and decrease considerably the number of pixels to
be processed, thus decreasing processing time and costs. Naturally, some
information, such as the ability to more precisely locate boundaries between
two areas or detect fine-scale structure in the data, would be lost in using
such filtering. For this data set, it was felt that such drawbacks would not
hurt the analysis effort. Accordingly, it was decided to filter along each
scan line using an appropriate low pass analog filter and sampling once every
20 milliradians. In addition, digital smoothing over 9 scan lines was used at
each scan point. The result was one digitized "average" data point from every
10 X 9 rectangle of data points in the original analog tape. This represents
an increase in signal to noise of 9.5:1 and a large decrease in the volume of
data output. In addition, this sampling scheme allowed the data to be digitized
eight times faster than it could have been done had we digitized every point
in the scan line. In all, some 40,000 analog data scan lines (representing
approximately 21 miles on the ground) were digitized. Each digitized scan line
consisted of 85 points of ground scene, and an additional 55 points of calibration
information.
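
The quoted 9.5:1 improvement matches the square root of the 90 points averaged, which holds if the noise in those points is uncorrelated. The sketch below illustrates this with a purely digital block average; it is a stand-in for the analog low pass filter plus digital smoothing actually used, and the function names are hypothetical.

```python
import math


def block_average(scan_lines, points_per_block=10, lines_per_block=9):
    """Average non-overlapping lines_per_block x points_per_block rectangles.

    `scan_lines` is a list of equal-length lists of data values. Partial
    blocks at the right and bottom edges are dropped.
    """
    averaged = []
    for top in range(0, len(scan_lines) - lines_per_block + 1, lines_per_block):
        row = []
        width = len(scan_lines[top])
        for left in range(0, width - points_per_block + 1, points_per_block):
            block = [
                scan_lines[r][c]
                for r in range(top, top + lines_per_block)
                for c in range(left, left + points_per_block)
            ]
            row.append(sum(block) / len(block))
        averaged.append(row)
    return averaged


def snr_gain(points_averaged):
    """Signal-to-noise gain from averaging, assuming uncorrelated noise."""
    return math.sqrt(points_averaged)


if __name__ == "__main__":
    print(round(snr_gain(10 * 9), 1))  # compare with the 9.5:1 quoted above
```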

After digitizing, the data were again checked for any unusual problems (noise, 
skew between channels, dropouts, etc.); none were found. The data were then 
dynamically clamped to the zero signal reference source (cold plate for the 
thermal channel and dark level for the other channels), i.e., processed to reduce 
any changes in the offset of each channel by calculating for each scan line 
the average values of the reference area for each of the channels, then 
subtracting these values from all points in the scan line. 
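
The clamping step amounts to a per-line offset removal. A minimal sketch for one channel follows; which of the calibration points hold the dark level or cold plate is not stated here, so the `reference_slice` argument is an assumption supplied by the caller.

```python
def clamp_scan_line(scan_line, reference_slice):
    """Subtract the mean of the zero-signal reference region from a scan line.

    `scan_line` is a list of values for one channel (scene points followed by
    calibration points); `reference_slice` selects the reference samples.
    """
    reference = scan_line[reference_slice]
    offset = sum(reference) / len(reference)
    return [value - offset for value in scan_line]


if __name__ == "__main__":
    # Three scene points followed by three reference points (illustrative values)
    print(clamp_scan_line([10.0, 12.0, 14.0, 2.0, 2.0, 2.0], slice(3, 6)))
```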

The preprocessing stage was completed by application of the average signal 
versus angle data transformation [16]. In this method, for each channel, the 
average signal at each discrete scan angle (pixel) is calculated and the resulting 
function analyzed. The average signal function in all channels was quadratic 
in form. The data were corrected by dividing the data values by the corresponding 
value of the correction function. 
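
One way to picture the scan angle correction is sketched below for a single channel. It is an illustration only: a least-squares quadratic is fitted to the average signal at each scan position and every value is divided by the fitted function. Whether the original correction function was normalized (for example to unity at nadir) is not stated, so no normalization is applied here, and all names are hypothetical.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit y ~ c0 + c1*x + c2*x^2 via the 3x3 normal equations."""
    s = [sum(x ** p for x in xs) for p in range(5)]
    t = [sum(y * x ** p for x, y in zip(xs, ys)) for p in range(3)]
    a = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    for col in range(3):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [vr - factor * vc for vr, vc in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def angle_correct(scan_lines):
    """Divide each value by a quadratic fitted to the average signal per scan position."""
    n_points = len(scan_lines[0])
    average = [
        sum(line[p] for line in scan_lines) / len(scan_lines)
        for p in range(n_points)
    ]
    # Centre the abscissa to keep the normal equations well conditioned
    xs = [p - (n_points - 1) / 2.0 for p in range(n_points)]
    c0, c1, c2 = fit_quadratic(xs, average)
    correction = [c0 + c1 * x + c2 * x * x for x in xs]
    return [[v / c for v, c in zip(line, correction)] for line in scan_lines]
```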

The output data were then used in the training and classification stages
described in the text.

APPENDIX VI

FIELD LOCATION IN S-192 DATA 

As a first step in identifying individual fields, graymaps were generated
for several of the bands which displayed good contrast and homogeneous areas;
however, it was not possible to accurately locate individual fields. Even
geographic features such as roads were not defined clearly enough to be of use
in matching the ground information to the graymaps.

Since fields could not be located by inspection of graymaps, a semi-automatic
procedure was employed which made use of an x-y coordinate digitizer which
efficiently digitizes the coordinates of points where a cursor has been
momentarily positioned. All points of interest (section corners, field corners,
etc.) were located on large scale photography. Points digitized for Skylab
processing were located on black and white enlargements of imagery acquired
by the U-2 overflights in mid-August, 1973.

To transform the photographic (x,y) coordinates into (scan line, and scan 
point) coordinates, control points which could be found with confidence on the 
graymaps as well as on the photographs were used. Being unable to find such 
obvious control points as roads or road intersections, bodies of water were 
used for control points. Comparison of a signature for a deep water lake and 
a general vegetative signature indicated a large separation of signals 
in SDOs 17 and 19. Therefore a two-channel classification for
water was performed; all points so classified were indicated on a
scan-line-straightened graymap. These points were compared to U-2 and S-190A false
color IR imagery to ascertain their precise place in the scene and finally were
located on the enlarged U-2 photographs.

A transformation was calculated using the control points and regression
techniques. The digitized points were then mapped from the photography (x,y)
coordinates to scan line, scan point coordinates for scan-line-straightened
data. The best-fit regression for the Skylab conversion yielded a first order
equation with no cross terms.

These coordinates were then converted to conic data coordinates. The
appropriate transformation was calculated by again defining a set of control
points and by using the inverse of the scan line straightening transformation
equations as given in the EREP Users Handbook, coupled with regression
techniques to accurately calculate the constants in the equations.
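
A first order regression with no cross terms, of the kind described for the photograph-to-graymap mapping, can be sketched as follows. The sketch assumes the form target = c0 + c1*x + c2*y, fitted separately for scan line and scan point by ordinary least squares; the function names and the control point values in the example are hypothetical.

```python
def fit_first_order(xs, ys, targets):
    """Least-squares fit of target ~ c0 + c1*x + c2*y (no x*y cross term)."""
    n = float(len(xs))
    sx, sy, st = sum(xs), sum(ys), sum(targets)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxt = sum(x * t for x, t in zip(xs, targets))
    syt = sum(y * t for y, t in zip(ys, targets))
    a = [[n, sx, sy, st], [sx, sxx, sxy, sxt], [sy, sxy, syy, syt]]
    for col in range(3):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [vr - factor * vc for vr, vc in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def apply_first_order(coefficients, x, y):
    """Evaluate c0 + c1*x + c2*y for one digitized point."""
    c0, c1, c2 = coefficients
    return c0 + c1 * x + c2 * y


if __name__ == "__main__":
    # Hypothetical control points: photo (x, y) and matching scan line numbers
    xs = [0.0, 1.0, 0.0, 2.0, 4.0]
    ys = [0.0, 0.0, 1.0, 3.0, 1.0]
    lines = [5.0 + 2.0 * x - 3.0 * y for x, y in zip(xs, ys)]
    print(fit_first_order(xs, ys, lines))
```

At least three control points that do not lie on one straight line are needed; additional points, as with the water bodies used here, reduce the effect of individual location errors.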

The equations we used were: 


CONIC POINT = A 



where 

P = [STRAIGHT POINT - 517.8 - 0.5]
N = 1239 points/conic scan line
θ = 116.25° field of scan

A and B are constants estimated from regression techniques.


Similarly, for scan lines:

$$\text{CONIC LINE} = C + D \cdot \text{STRAIGHT LINE} - E \cdot R \cos\left[\frac{(2 \cdot \text{CONIC POINT} - 2 - N)\,\theta}{2N}\right]$$

with

R = radius of the scan circle projected on the Earth
R ≈ 608 pixels

and C, D, and E are constants estimated from regression techniques. To
perform the regression, 18 points were located on both conic and straightened
graymaps. The regression fit was very good and, further, all 5 coefficients
seemed to be sensible, a reflection of the physical reality.

With the field coordinates converted, the ground information was merged
with the conic data. Graymaps of two conic data channels and the ground
information channels were overlaid for comparison and the conversion was
deemed very satisfactory.

APPENDIX VII

DESIGN OF THE EXPERIMENT TO ASSIST IN THE 
ANALYSIS OF THE EFFECTS OF CHANNEL-TO-CHANNEL SPATIAL 
MISREGISTRATION OF S-192 DATA ON "FIELD-CENTER" PIXELS 

An integral part in the evaluation of the effects of misregistration of 
S-192 data is an investigation of the effects on field center pixels that remain 
field center in all channels even after misregistration. The following outlines 
the experiment designed to assist in this analysis. Since the analysis was based
on a simulation of the effects of misregistration, the base signatures were
extracted from the corrected conical S-192 data set, which was assumed for purposes
of simulation to be perfectly registered from channel to channel.

Step 1. Choose a signature set. 

Five S-192 field center signatures were chosen representing the 
predominant scene classes: corn, tree, grass, brush, and bare soil. 

A subset of seven S-192 SDOs was used (SDOs 2, 8, 10, 12, 17, 19, 20).

Step 2. Choose a subset of n channels to misregister in simulation.

There were two phases to this step in the experiment. Initially three
channels, SDOs 2, 12, and 17, were chosen to be misregistered. These three
SDOs were chosen because they were found to be the three best channels
for purposes of discrimination in the least-probability-of-misclassification
sense. Next, in a parallel experiment, only SDO 12 was misregistered. It
had been determined to be the best single SDO for purposes of discrimination.

Step 3. Choose varying degrees of misregistration to simulate.

Each of the channels described in Step 2 was misregistered in
simulation by fixed amounts of 1/3, 1/2, 2/3 and 1 full pixel.

Step 4. Run a computer program to calculate simulated field-center signatures
for each degree of misregistration determined in Step 3.

The simulation model used is described in Section 4.3.2. A computer
program was written to implement the algorithm simulating the effects of
channel-to-channel spatial misregistration of field center pixels. This
program was run to produce four sets of signatures with three misregistered
channels and four sets of signatures with one channel misregistered.
Each set represented a different degree of misregistration.

Step 5. Calculate an expected performance matrix for each set of signatures.

The program PEC was used to calculate these matrices. PEC is fully
described in Appendix XI. The program was run for each set of
signatures simulating effects of misregistration along with the original
"registered" signatures.

Step 6. Analyze the results in light of the analytical expectations.

The performance matrices were analyzed as is described in
Section 4.3.4.

APPENDIX VIII

DESIGN OF THE EXPERIMENT TO ASSIST IN THE ANALYSIS 
OF THE EFFECTS OF CHANNEL-TO-CHANNEL SPATIAL 

MISREGISTRATION OF S-192 DATA ON "BORDER" OR "MIXTURE" PIXELS

The following experiment was implemented to determine the effects of 
misregistration on mixture pixels. Since the analysis was based on a 
simulation of the effects of misregistration, the base signatures were extracted 
from the corrected conical S-192 data set which was assumed for purposes of 
simulation to be perfectly registered from channel to channel. The analysis
carried out pertains only to those pixels that are mixture pixels in some
channel(s) after misregistration.

Step 1. Choose a signature set. 

Five S-192 field center signatures were chosen representing the
predominant scene classes: corn, tree, grass, brush, and bare soil.

A subset of seven S-192 SDOs (2, 8, 10, 12, 17, 19, 20) was used.

Step 2. Choose a subset of n channels to misregister in simulation. 

There were two phases to this step in the experiment. First three 
channels, SDOs 2, 12, and 17, were used. These three SDOs were chosen 
because they were found to be the three best channels for purposes of 
discrimination in the least-probability-of-misclassification sense. 

Next, a single SDO, 12, was used. It had been determined to be the 
best SDO for purposes of discrimination. 

Step 3. Choose varying degrees of misregistration to simulate. 

Each of the channels described in Step 2 was misregistered in
simulation by fixed amounts of 1/2 and 1 full pixel in the east to west
direction.

Step 4. Run a computer program to calculate simulated field center
signatures for each degree of misregistration determined in Step 3.

The simulation model used is described in Section 4.4.2. A computer
program was written to implement the algorithm simulating the
effects of channel-to-channel spatial misregistration on field center
pixels. This program was run to produce six sets of signatures, three
for each parameter setting of channels to be misregistered.

Step 5. Choose varying proportions of mixtures of two ground covers to simulate.

Distributions of mixtures of two ground covers, A and B, were to be 
simulated in the proportions 2/3 A to 1/3 B and 1/3 A to 2/3 B. 

Step 6. Simulate mixture distributions in the proportions chosen in Step 5 
for all possible pairs of registered field center signatures chosen 
in Step 1. 

The program discussed in Step 4 was optionally run to simulate these 
mixture distributions in the proportions described in Step 5. These 
mixtures represented the actual distributions expected to be found in 
the S-192 data set under the assumptions of the model used. 
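
The bookkeeping behind Steps 5 and 6 can be sketched in a few lines of Python. This is only an illustration: the names `mixture_cases` and `mixed_mean` are hypothetical, and the area-weighted mean used here is an assumed mixing rule, not necessarily the simulation model of section 4.4.2.

```python
from fractions import Fraction
from itertools import combinations

# Scene classes from Step 1 and mixture proportions from Step 5.
CLASSES = ["corn", "tree", "grass", "brush", "bare soil"]
PROPORTIONS = [(Fraction(2, 3), Fraction(1, 3)),
               (Fraction(1, 3), Fraction(2, 3))]


def mixture_cases(classes, proportions):
    """List every (cover A, cover B, share of A, share of B) combination."""
    return [(a, b, pa, pb)
            for a, b in combinations(classes, 2)
            for pa, pb in proportions]


def mixed_mean(mean_a, mean_b, pa, pb):
    """Assumed mixing rule: area-weighted average of the two class means."""
    return [pa * x + pb * y for x, y in zip(mean_a, mean_b)]


cases = mixture_cases(CLASSES, PROPORTIONS)
print(len(cases))  # 10 pairs times 2 proportions
```

With five classes there are ten unordered pairs, so the two proportions give twenty registered mixture cases.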

Step 7. Simulate mixture distributions in the proportions chosen in Step 5 
for all possible pairs of misregistered field center signatures 
for each misregistration chosen in Step 3. 

For one-half pixel misregistration, twelve distributions were 
simulated for each of the field-center misregistered distributions 
calculated in Step 4. For one pixel misregistration, twenty signatures 
were simulated for each of the base signatures. The difference in 
the number of simulations lay in the fact that, for a greater degree 
of misregistration, more field center pixels would be mixtures in the 
misregistered channels. Hence more distributions were simulated 
in order to better represent the situation. 

Step 8. Calculate an expected performance matrix for each degree of 
misregistration. 

Using the program PEC, three performance matrices were calculated, one 
for each misregistration of 0, 1/2 and 1 full pixel. Program PEC is 
described in Appendix XI. The field center signatures simulated in 
Step 4 were used as recognition classes. Linear decision boundaries were 
determined based on these signatures. Then the signatures simulated in 
Step 7 were used as the scene classes and expected performance probabilities 
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were calculated for each of the simulated distributions. 

Step 9. Plot the results. 

Graphs were generated displaying the probability of classification 
of a ground cover as a function of the mixture and misregistration. 

Step 10. Analyze the results. 

The plots were analyzed as described in Section 4.4.5. 
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APPENDIX IX 

A SIMPLE ANALYTICAL MODEL TO STUDY THE EFFECTS OF 
MISREGISTRATION ON FIELD CENTER CLASSIFICATION ACCURACY 

Insight has been gained into what effects spatial misregistration may 
have on field-center classification accuracy through an analytical analysis 
of the problem. Consider two normal distributions in n channels, $N(\mu_A, R)$ 
and $N(\mu_B, R)$, with a common covariance $R$. The probability of a type-one 
error* using the linear decision rule is 

$\Phi\left[\tfrac{1}{2}\left(\mu^T R^{-1} \mu\right)^{1/2}\right]$    (IX-1) 

where 

$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-y^2/2}\, dy$    (IX-2) 

and $\mu = \mu_A - \mu_B$, the channel to channel mean difference. 
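
As a numerical illustration of Eqs. IX-1 and IX-2, the following sketch evaluates the error rate for two channels. The function names are hypothetical, the closed form of the quadratic form assumes the usual two-channel covariance with standard deviations sigma and correlation rho, and the upper-tail integral is computed with the standard library's complementary error function.

```python
import math


def upper_tail(x):
    """Phi(x) of Eq. IX-2: area of the standard normal density above x."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def quad_form_2ch(mu, sigma, rho):
    """mu^T R^-1 mu for two channels (closed form of the 2x2 inverse)."""
    (m1, m2), (s1, s2) = mu, sigma
    num = (s2 * m1) ** 2 - 2.0 * rho * s1 * s2 * m1 * m2 + (s1 * m2) ** 2
    return num / ((s1 * s2) ** 2 * (1.0 - rho ** 2))


def error_rate(mu, sigma, rho):
    """Type-one error of Eq. IX-1 for mean difference mu."""
    return upper_tail(0.5 * math.sqrt(quad_form_2ch(mu, sigma, rho)))


# Illustrative values: unit variances, uncorrelated channels,
# mean difference of two standard deviations in channel 1 only.
print(error_rate((2.0, 0.0), (1.0, 1.0), 0.0))
```

For these illustrative values the quadratic form equals 4, so the error rate is the normal tail area above one standard deviation.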

Studies have indicated that misregistration from channel to channel, or 
time period to time period in the case of multitemporal analysis, causes resultant 
signatures to be less correlated. This analysis, therefore, attempts to examine 
the error rate $\Phi$ as a function of correlation $\rho$. 

*Under the assumption of common covariance, type-two error is equivalent to 
type-one error. 

Also, let $f(\rho) = \mu^T R^{-1} \mu$ for $-1 < \rho < 1$    (IX-4) 

and $g(\rho) = \tfrac{1}{2} f(\rho)^{1/2}$. 

$f(\rho) \to \infty$ at $\rho = \pm 1$. 

Similarly $g(\rho) \to \infty$ at $\rho = \pm 1$, which implies $x \to \infty$ at $\rho = \pm 1$. $\Phi$ can be 
expressed as a function of $\rho$ through $f(\rho)$ and/or $g(\rho)$: 

$\Phi = \Phi\left[\tfrac{1}{2} f(\rho)^{1/2}\right] = \Phi[g(\rho)]$. 

Substituting $x = \infty$ into Eq. IX-2 we have $\Phi(\infty) = 0$. We have established, 
therefore, that the error rate $\Phi$ is minimized for correlation $\rho = \pm 1$. Let 
us now examine the behavior of the function $\Phi$ for $-1 < \rho < 1$. 

Although restricting ourselves to two channels, we note that the following 
analysis can be generalized for $\rho_{ij}$, the correlation between any pair of 
channels i and j. 

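Let us now calculate the first derivative of $f(\rho)$: 

$\frac{d f(\rho)}{d\rho} = \frac{d(\mu^T R^{-1} \mu)}{d\rho} = \mu^T \frac{dR^{-1}}{d\rho} \mu$    (IX-5) 

We can simplify the calculation of $\frac{dR^{-1}}{d\rho}$ by noting the following relationship 
between $\frac{dR}{d\rho}$ and $\frac{dR^{-1}}{d\rho}$: 

$\frac{d}{d\rho}\left(R R^{-1}\right) = \frac{d}{d\rho}(I) = 0 = \frac{dR}{d\rho} R^{-1} + R \frac{dR^{-1}}{d\rho}$ 

Solving for $\frac{dR^{-1}}{d\rho}$ we find: 

$\frac{dR^{-1}}{d\rho} = -R^{-1} \frac{dR}{d\rho} R^{-1}$    (IX-6) 

Substituting Eq. IX-6 into Eq. IX-5 and solving: 

$\frac{d f(\rho)}{d\rho} = -\mu^T R^{-1} \frac{dR}{d\rho} R^{-1} \mu$ 

Noting that $\mu^T R^{-1} = [(R^{-1})^T \mu]^T$ and $(R^{-1})^T = R^{-1}$: 

$\frac{d f(\rho)}{d\rho} = -(R^{-1}\mu)^T \frac{dR}{d\rho} (R^{-1}\mu)$    (IX-7) 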


Eq. IX-7 is an expression for the first derivative of $f(\rho)$ in terms of 
the derivative of $R$. Now let us examine whether, for $-1 < \rho < 1$, critical values 
of $f(\rho)$ exist. Individually examining the components of equation (IX-6), 
we determine the following expressions for two channels. 


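$R^{-1}\mu = \frac{1}{\sigma_1^2 \sigma_2^2 (1-\rho^2)} \begin{pmatrix} \sigma_2^2 \mu_1 - \rho \sigma_1 \sigma_2 \mu_2 \\ \sigma_1^2 \mu_2 - \rho \sigma_1 \sigma_2 \mu_1 \end{pmatrix}$    (IX-8) 

$\frac{dR}{d\rho} = \sigma_1 \sigma_2 \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$    (IX-9) 

Now substituting IX-8 and IX-9 into IX-7: 

$\frac{d f(\rho)}{d\rho} = -\sigma_1 \sigma_2\, (R^{-1}\mu)^T \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} (R^{-1}\mu)$    (IX-10) 

$\frac{d f(\rho)}{d\rho} = -2 b\, \sigma_1 \sigma_2 \left(\sigma_2^2 \mu_1 - \rho \sigma_1 \sigma_2 \mu_2\right)\left(\sigma_1^2 \mu_2 - \rho \sigma_1 \sigma_2 \mu_1\right); \quad b > 0$    (IX-11) 

where $b = [\sigma_1^2 \sigma_2^2 (1-\rho^2)]^{-2}$ is the square of the scalar factor in Eq. IX-8. 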
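
The two-channel derivative can be checked numerically. The sketch below compares the closed-form slope, written as minus twice sigma1 sigma2 times the product of the two rows of R^-1 mu, with a central finite difference of f. All names and the test values are illustrative assumptions.

```python
def quad_form(mu, sigma, rho):
    """f(rho) = mu^T R^-1 mu for two channels (closed form of the 2x2 inverse)."""
    (m1, m2), (s1, s2) = mu, sigma
    num = (s2 * m1) ** 2 - 2.0 * rho * s1 * s2 * m1 * m2 + (s1 * m2) ** 2
    return num / ((s1 * s2) ** 2 * (1.0 - rho ** 2))


def analytic_slope(mu, sigma, rho):
    """df/drho = -2 * sigma1 * sigma2 * row1 * row2, rows taken from R^-1 mu."""
    (m1, m2), (s1, s2) = mu, sigma
    scale = 1.0 / ((s1 * s2) ** 2 * (1.0 - rho ** 2))
    row1 = scale * (s2 ** 2 * m1 - rho * s1 * s2 * m2)
    row2 = scale * (s1 ** 2 * m2 - rho * s1 * s2 * m1)
    return -2.0 * s1 * s2 * row1 * row2


def numeric_slope(mu, sigma, rho, h=1e-6):
    """Central finite-difference estimate of df/drho."""
    return (quad_form(mu, sigma, rho + h) - quad_form(mu, sigma, rho - h)) / (2.0 * h)


MU, SIGMA = (1.0, 2.0), (1.0, 1.0)  # illustrative values only
print(analytic_slope(MU, SIGMA, 0.2), numeric_slope(MU, SIGMA, 0.2))
```

For these values the slope vanishes at a correlation of 0.5, where the first row of R^-1 mu is zero.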


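For $\frac{d f(\rho)}{d\rho} = 0$, either $\sigma_1 \sigma_2$ or one of the two rows of $R^{-1}\mu$ must equal zero. 
Since $\Phi(x) > 0$ and continuous, and has minima defined at $\rho = \pm 1$, then $\Phi$ 
is maximized at 

$\rho_c = \frac{\mu_1 \sigma_2}{\mu_2 \sigma_1}$ 

for $-1 < \rho < 1$. 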
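
A brute-force scan gives a quick sanity check on the location of the maximum. The sketch below evaluates the two-channel error rate on a grid of correlations and returns the grid point with the largest error. The function names and the example means and standard deviations are assumptions chosen so that the critical correlation is 0.5.

```python
import math


def error_rate(mu, sigma, rho):
    """Phi[(1/2) sqrt(mu^T R^-1 mu)] for two channels with correlation rho."""
    (m1, m2), (s1, s2) = mu, sigma
    f = ((s2 * m1) ** 2 - 2.0 * rho * s1 * s2 * m1 * m2 + (s1 * m2) ** 2) \
        / ((s1 * s2) ** 2 * (1.0 - rho ** 2))
    return 0.5 * math.erfc(0.5 * math.sqrt(f) / math.sqrt(2.0))


def worst_correlation(mu, sigma, steps=1999):
    """Grid search over -1 < rho < 1 for the correlation with the largest error."""
    grid = [-1.0 + 2.0 * (i + 1) / (steps + 1) for i in range(steps)]
    return max(grid, key=lambda rho: error_rate(mu, sigma, rho))


MU, SIGMA = (1.0, 2.0), (1.0, 1.0)  # mu1*sigma2 / (mu2*sigma1) = 0.5
print(worst_correlation(MU, SIGMA))
```

The grid spacing here is 0.001, so the scan can only confirm the analytical value to that resolution.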
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Before examining the implications of this result, let us determine 
whether this result can be generalized. 


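Eqs. IX-5 to IX-7 can be generalized by letting $\rho = \rho_{ij}$ and 
$\frac{dR}{d\rho} = \frac{\partial R}{\partial \rho_{ij}}$ for any pair of channels i and j. 

Hence: 

$\frac{\partial R^{-1}}{\partial \rho_{ij}} = -R^{-1} \frac{\partial R}{\partial \rho_{ij}} R^{-1}$ 

and 

$\frac{\partial f}{\partial \rho_{ij}} = -(R^{-1}\mu)^T \frac{\partial R}{\partial \rho_{ij}} (R^{-1}\mu)$ 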

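Examining a three-dimensional case, 

$R = \begin{pmatrix} \sigma_1^2 & \rho_{12}\sigma_1\sigma_2 & \rho_{13}\sigma_1\sigma_3 \\ \rho_{12}\sigma_1\sigma_2 & \sigma_2^2 & \rho_{23}\sigma_2\sigma_3 \\ \rho_{13}\sigma_1\sigma_3 & \rho_{23}\sigma_2\sigma_3 & \sigma_3^2 \end{pmatrix}$ 

Therefore: 

$\frac{\partial R}{\partial \rho_{12}} = \sigma_1\sigma_2 \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$ 

$\frac{\partial R}{\partial \rho_{13}} = \sigma_1\sigma_3 \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}$ 

$\frac{\partial R}{\partial \rho_{23}} = \sigma_2\sigma_3 \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$ 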

Following the same line of reasoning as in two dimensions (Eqs. IX-10 and IX-11), 
we find that 

$\frac{\partial f(\rho_{12})}{\partial \rho_{12}} = 0$ when either the 1st or 2nd row of $R^{-1}\mu$ is zero; 

similarly for $f(\rho_{23})$ and $f(\rho_{13})$. 

We can now generalize to conclude 

$\frac{\partial f}{\partial \rho_{ij}} = 0$ at some $\rho_{c_{ij}}$ in the interval defined by $-1 < \rho_{ij} < 1$ 

for any pair of channels i, j. The point can be calculated exactly by setting 
the i-th or j-th row of $R^{-1}\mu$ equal to zero and solving for $\rho_{ij}$. The function f 
is a function of many variables, $f(\rho_{12}, \rho_{13}, \ldots, \rho_{ij})$ for all i, j. We have 
determined that (1) the function $\Phi$ is minimized along its boundary in the 
interval $-1 \le \rho \le 1$ and (2) the function f has a critical point at $\rho_{c_{ij}}$ 
with respect to each variable $\rho_{ij}$ for all i and j, and these critical 
points must be maxima of $\Phi$. Under these conditions we can conclude that the 
function $\Phi$ reaches a maximum on the interval $-1 < \rho_{ij} < 1$. 

Let us now examine the implications of this analysis graphically for 
two channels of data: 

[Figure: three panels labeled CASE 1, CASE 2, and CASE 3] 

FIGURE IX-1. ERROR RATE OF RECOGNITION $\Phi$ AS A FUNCTION 
OF CORRELATION $\rho$ IN FIELD CENTERS 


Figure IX-1 displays possible curves mapping the error rate $\Phi$ in field centers 
as a function of $\rho$. A maximum error occurs at $\rho_c$. $\Phi$ is minimized at $\rho = \pm 1$ 
and intercepts the y axis at $\rho = 0$, where $f(0) = \mu_1^2/\sigma_1^2 + \mu_2^2/\sigma_2^2$. 


$\rho_c$ occurs at 

$\frac{\mu_1 \sigma_2}{\mu_2 \sigma_1}$  or  $\frac{\mu_2 \sigma_1}{\mu_1 \sigma_2}$. 

Let $\rho_r$ be the correlation of a registered data set in two channels 
and let $\rho_m$ be the correlation of the same data set but misregistered to 
varying degrees. Keep in mind that misregistering data will cause the 
correlation to decrease. Let us examine each case depicted in Figure IX-1 
separately. 

CASE 1 

(1) if $0 < \rho_c < \rho_r < 1$, then misregistering the data set would cause 
the error rate to increase until $\rho_m = \rho_c$, then it would restore accuracy 
somewhat until $\rho_m = 0$. 

(2) if $0 < \rho_r < \rho_c < 1$, then misregistration would actually improve 
results. 

(3) if $-1 < \rho_r < 0$, then misregistration would cause the error rate 
to increase. 

(4) if 1, misregistration would always improve field center results. 

CASE 2 

(1) if $-1 < \rho_r < \rho_c < 0$, this behaves as case 1, step (1). 

(2) if $-1 < \rho_c < \rho_r < 0$, see case 1, step (2). 

(3) if $0 < \rho_r < 1$, see case 1, step (3). 

(4) if p^ ^ -1, see case 1, step (4) 

CASE 3 

In this case misregistration would always cause the error rate to 
increase. 
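
The Case 1 behavior can be illustrated numerically. The sketch below compares the two-channel error rate before and after a drop in correlation. The helper name `effect_of_misregistration` and the parameter values (chosen so that the critical correlation is 0.5) are assumptions for illustration only.

```python
import math


def error_rate(mu, sigma, rho):
    """Phi[(1/2) sqrt(mu^T R^-1 mu)] for two channels with correlation rho."""
    (m1, m2), (s1, s2) = mu, sigma
    f = ((s2 * m1) ** 2 - 2.0 * rho * s1 * s2 * m1 * m2 + (s1 * m2) ** 2) \
        / ((s1 * s2) ** 2 * (1.0 - rho ** 2))
    return 0.5 * math.erfc(0.5 * math.sqrt(f) / math.sqrt(2.0))


def effect_of_misregistration(mu, sigma, rho_r, rho_m):
    """Positive when the misregistered error rate exceeds the registered one."""
    return error_rate(mu, sigma, rho_m) - error_rate(mu, sigma, rho_r)


MU, SIGMA = (1.0, 2.0), (1.0, 1.0)  # illustrative values; critical correlation 0.5

# Case 1 (1): registered correlation above the critical value.
print(effect_of_misregistration(MU, SIGMA, 0.9, 0.5))
# Case 1 (2): registered correlation below the critical value.
print(effect_of_misregistration(MU, SIGMA, 0.3, 0.0))
# Case 1 (3): negative registered correlation.
print(effect_of_misregistration(MU, SIGMA, -0.5, 0.0))
```

A positive value means misregistration degraded field-center accuracy and a negative value means it improved.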
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APPENDIX X 


DERIVATION OF CORRELATION ESTIMATION MODEL FOR TWO 
CHANNELS MISREGISTERED WITH RESPECT TO ONE ANOTHER 

[Figure: boundary between Crop 1 and Crop 2, showing the resolution element in channel k] 

FIGURE X-1. CONFIGURATION OF BOUNDARY RESOLUTION 
ELEMENTS OF TWO CHANNELS OF DATA MISREGISTERED 
WITH RESPECT TO ONE ANOTHER 

In the derivation of the covariance estimation model, we restrict ourselves 
to two channels of data and two crop types. Figure X-1 illustrates a possible 
configuration of boundary elements for two channels misregistered with respect to 
one another. It is the cross-correlation between two such channels that we are 
interested in calculating. 

Let $S_{ij}(\alpha,\beta)$ be the signal per unit area from ground coordinate $(\alpha,\beta)$ for 
the i-th crop, j-th channel. This signal is assumed to originate from a 
stationary random process, with statistics: 

$E[S_{ij}(\alpha,\beta)] = A_{ij}$ 

$E\{[S_{ij}(\alpha_1,\beta_1) - A_{ij}][S_{hk}(\alpha_2,\beta_2) - A_{hk}]\} = \delta(i,h)\, r_{ijk}(\alpha_1-\alpha_2, \beta_1-\beta_2)$ 

$\delta(i,h)$ is Kronecker's delta function. If $i \neq h$, i.e., two different 
crops, correlation is assumed to be zero. 

$r_{ijk}(\alpha_1-\alpha_2, \beta_1-\beta_2)$ is the correlation function and is dependent on the 
distance between the locations on the ground. 




The assumption made is that the correlation between two pixels drops rapidly as 
the distance between two pixels increases. The correlation between two adjacent 
pixels is assumed to be zero. 

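The scanner signal in the j-th channel is the sum over the resolution area 
of all signals $S_{ij}(\alpha,\beta)$: 

$X_j = \int_{a_j}^{0} d\alpha \int_{c_j}^{d_j} d\beta\, S_{1j}(\alpha,\beta) + \int_{0}^{b_j} d\alpha \int_{c_j}^{d_j} d\beta\, S_{2j}(\alpha,\beta)$ 

with statistics: 

$E(X_j) = \int_{a_j}^{0} d\alpha \int_{c_j}^{d_j} d\beta\, A_{1j} + \int_{0}^{b_j} d\alpha \int_{c_j}^{d_j} d\beta\, A_{2j}$ 

$X_j - E(X_j) = \int_{a_j}^{0} d\alpha \int_{c_j}^{d_j} d\beta\, [S_{1j}(\alpha,\beta) - A_{1j}] + \int_{0}^{b_j} d\alpha \int_{c_j}^{d_j} d\beta\, [S_{2j}(\alpha,\beta) - A_{2j}]$ 

The correlation between channels j and k is: 

$R_{jk} = E\{[X_j - E(X_j)][X_k - E(X_k)]\}$ 

Multiplying this expression out, we note that cross terms drop out due to 
Kronecker's delta: 

$R_{jk} = \int_{a_j}^{0} d\alpha_1 \int_{a_k}^{0} d\alpha_2 \int_{c_j}^{d_j} d\beta_1 \int_{c_k}^{d_k} d\beta_2\, r_{1jk}(\alpha_1-\alpha_2, \beta_1-\beta_2)$ 
$\quad + \int_{0}^{b_j} d\alpha_1 \int_{0}^{b_k} d\alpha_2 \int_{c_j}^{d_j} d\beta_1 \int_{c_k}^{d_k} d\beta_2\, r_{2jk}(\alpha_1-\alpha_2, \beta_1-\beta_2)$ 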

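To simplify the algebra let $c_j = c_k = c$ and $d_j = d_k = d$. This means that only misregistration 
in one direction is considered. We will generalize later to two directions. 
Using this assumption along with the identity*: 

$R_{jk} = \int_{a_j}^{0} d\alpha_1 \int_{a_k}^{0} d\alpha_2\, (d-c) \int_{-(d-c)}^{d-c} r_{1jk}(\alpha_1-\alpha_2, \beta) \left(1 - \frac{|\beta|}{d-c}\right) d\beta$ 
$\quad + \int_{0}^{b_j} d\alpha_1 \int_{0}^{b_k} d\alpha_2\, (d-c) \int_{-(d-c)}^{d-c} r_{2jk}(\alpha_1-\alpha_2, \beta) \left(1 - \frac{|\beta|}{d-c}\right) d\beta$ 

Let 

$F_{1jk}(\alpha_1-\alpha_2) = (d-c) \int_{-(d-c)}^{d-c} r_{1jk}(\alpha_1-\alpha_2, \beta) \left(1 - \frac{|\beta|}{d-c}\right) d\beta$ 

and similarly 

$F_{2jk}(\alpha_1-\alpha_2) = (d-c) \int_{-(d-c)}^{d-c} r_{2jk}(\alpha_1-\alpha_2, \beta) \left(1 - \frac{|\beta|}{d-c}\right) d\beta$ 

Substituting we have: 

$R_{jk} = \int_{a_j}^{0} d\alpha_1 \int_{a_k}^{0} d\alpha_2\, F_{1jk}(\alpha_1-\alpha_2) + \int_{0}^{b_j} d\alpha_1 \int_{0}^{b_k} d\alpha_2\, F_{2jk}(\alpha_1-\alpha_2)$    (X-1) 

* Simplified using the identity 

$\int_A^B \int_A^B F(u-v)\, du\, dv = (B-A) \int_{-(B-A)}^{B-A} F(x) \left(1 - \frac{|x|}{B-A}\right) dx$ 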
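
The double-to-single integral identity used in this derivation can be verified numerically. The sketch below applies simple midpoint sums to both sides for an exponentially decaying correlation function. The function names, the choice of `exp(-|x|)`, and the interval are illustrative assumptions.

```python
import math


def double_integral(F, A, B, n=400):
    """Midpoint-rule estimate of the integral of F(u - v) over [A, B] x [A, B]."""
    h = (B - A) / n
    pts = [A + (i + 0.5) * h for i in range(n)]
    return sum(F(u - v) for u in pts for v in pts) * h * h


def single_integral(F, A, B, n=4000):
    """Midpoint-rule estimate of (B-A) * integral of F(x) * (1 - |x|/(B-A))
    over the interval [-(B-A), B-A]."""
    w = B - A
    h = 2.0 * w / n
    xs = [-w + (i + 0.5) * h for i in range(n)]
    return w * sum(F(x) * (1.0 - abs(x) / w) for x in xs) * h


def corr(x):
    """Assumed correlation function that drops off with distance."""
    return math.exp(-abs(x))


lhs = double_integral(corr, 0.0, 2.0)
rhs = single_integral(corr, 0.0, 2.0)
print(lhs, rhs)
```

For this particular correlation function both sides have the closed form 2w - 2 + 2exp(-w) with w = B - A, which gives an independent check on the sums.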


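Now examine each component of $R_{jk}$, assuming that $a_j < a_k$ (the same argument 
applies otherwise). 

$\int_{a_j}^{0} d\alpha_1 \int_{a_k}^{0} d\alpha_2\, F_{1jk}(\alpha_1-\alpha_2) = \int_{a_j}^{a_k} d\alpha_1 \int_{a_k}^{0} d\alpha_2\, F_{1jk}(\alpha_1-\alpha_2) + \int_{a_k}^{0} d\alpha_1 \int_{a_k}^{0} d\alpha_2\, F_{1jk}(\alpha_1-\alpha_2)$ 

$\quad = \int_{a_j}^{a_k} d\alpha_1 \int_{a_k}^{0} d\alpha_2\, F_{1jk}(\alpha_1-\alpha_2) + (-a_k) \int_{a_k}^{-a_k} F_{1jk}(\alpha) \left(1 - \frac{|\alpha|}{-a_k}\right) d\alpha$    (X-2) 

The contribution to the estimated covariance from any non-overlapping 
region is assumed to be zero. The left component of Eq. X-2 
determines this contribution, hence it can be eliminated. Thus the 
left hand term of $R_{jk}$ is: 

$(-a_k) \int_{a_k}^{-a_k} F_{1jk}(\alpha) \left(1 - \frac{|\alpha|}{-a_k}\right) d\alpha$    (X-3) 

Similarly for $b_j < b_k$ we find: 

$\int_{0}^{b_j} d\alpha_1 \int_{0}^{b_k} d\alpha_2\, F_{2jk}(\alpha_1-\alpha_2) = b_j \int_{-b_j}^{b_j} F_{2jk}(\alpha) \left(1 - \frac{|\alpha|}{b_j}\right) d\alpha$    (X-4) 

Substituting Eqs. X-3 and X-4 into Eq. X-1 we have: 

$R_{jk} = (-a_k) \int_{a_k}^{-a_k} F_{1jk}(\alpha) \left(1 - \frac{|\alpha|}{-a_k}\right) d\alpha + b_j \int_{-b_j}^{b_j} F_{2jk}(\alpha) \left(1 - \frac{|\alpha|}{b_j}\right) d\alpha$    (X-5) 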

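If the pixels being examined were pure crop 2 pixels, the expression 
evaluated for $R_{jk}$ would be the covariance $R_{2jk}$ between channels j and k in 
crop 2. In order to simplify the expression for a border pixel we need to 
evaluate it in the field center case. 

For crop 2, $a_j = a_k = 0$ and let $b_j = b_k = b$, hence 

$R_{jk} = R_{2jk} = \int_{0}^{b} d\alpha_1 \int_{0}^{b} d\alpha_2\, F_{2jk}(\alpha_1-\alpha_2)$ 

simplifying: 

$R_{2jk} = b \int_{-b}^{b} F_{2jk}(\alpha) \left(1 - \frac{|\alpha|}{b}\right) d\alpha$    (X-6) 

Similarly for crop 1, $b_j = b_k = 0$ and let $|a_j| = |a_k| = a$: 

$R_{1jk} = a \int_{-a}^{a} F_{1jk}(\alpha) \left(1 - \frac{|\alpha|}{a}\right) d\alpha$    (X-7) 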


We now have $R_{2jk}$ and $R_{1jk}$, the covariance terms for channels j and k for 
crops two and one. 

For a mixed pixel, we make two observations. 

(1) The covariance of two points on the ground drops very 
rapidly as a function of the distance between them then: 

(2) To substitute Eqns. (X-6) and (X-7) into Eq. (X-5) we need to 
normalize by dividing respective terms by a and b, the 
widths of the respective pixels. 
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Having made these observations we can conclude for a boundary pixel, 


the covariance can be calculated using the expression 

$R_{jk} = \frac{-a_k}{a} R_{1jk} + \frac{b_j}{b} R_{2jk}$    (X-8) 



Eq. X-8 was derived under the assumption that misregistration was in only 
one direction. The simulation model described in section 4.4.2 is based on this 
assumption. The analog of Eq. X-8 for misregistration in two directions 
is a trivial extension of Eq. X-8 and is determined to be: 


‘^jk 



\ ^Ijk 



*^2jk 


(X-9) 


where $c = d_k - c_k$ and $d = d_j - c_j$ are the heights of each resolution element. 

We note that in our case the widths of the respective 
pixels are the same size, hence $a = b$. Therefore $-a_k/a$ is the proportion of 
overlap in crop 1, and $b_j/b$ is the proportion of overlap in crop 2. 
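
A small worked example may help fix the one-direction result. The sketch below weights the two field-center covariances by the proportion of the pixel width over which the two channels overlap in each crop. The function names, the coordinate convention (boundary at 0, crop 1 on the negative side) and the numbers are assumptions for illustration, not values from the report.

```python
def overlap_proportions(a_j, a_k, b_j, b_k, width):
    """Share of the pixel width where channels j and k overlap in each crop.

    a_j and a_k are the (non-positive) left edges in crop 1, b_j and b_k the
    (non-negative) right edges in crop 2, with the crop boundary at 0.
    """
    in_crop1 = min(-a_j, -a_k) / width
    in_crop2 = min(b_j, b_k) / width
    return in_crop1, in_crop2


def boundary_covariance(r1, r2, in_crop1, in_crop2):
    """Overlap-weighted sum of the crop 1 and crop 2 field-center covariances."""
    return in_crop1 * r1 + in_crop2 * r2


# Half-pixel misregistration of unit-width pixels straddling the boundary:
# channel j spans [-0.75, 0.25] and channel k spans [-0.25, 0.75].
p1, p2 = overlap_proportions(-0.75, -0.25, 0.25, 0.75, width=1.0)
print(boundary_covariance(4.0, 10.0, p1, p2))
```

When both channels lie entirely in one crop and are registered, the proportions reduce to 1 and 0 and the field-center covariance is recovered.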
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APPENDIX XI 

DESCRIPTION OF PROGRAM PEC 

PEC is a program written in the MAD language under UMESS for the 
IBM 7094. PEC will compute the expected performance matrix for the ERIM 
linear rule classifier on a given set of signatures and classifier 
parameters by using a Monte-Carlo technique. The matrix gives the probability 
that pixels from each given signature distribution will be classified into each 
given recognition class based on the best linear decision boundaries between 
recognition classes. The classifier works as follows. Between each pair of 
signatures A and B, a boundary is found to separate those pixels which might be 
classified as A from those classified as B. This boundary is a linear 
hyperplane of the form 

$[X_o \cdot C - D] = 0$ 

where 

$X_o$ is any point on the hyperplane 

$C$ is a vector normal to the hyperplane 

$D$ is a constant which is the distance from the origin 
to the plane in units of the length of $C$. In this 
program we normalize $C$ to be of unit length. 

If $[X \cdot C - D] < 0$, then $X$ will be classified as A; otherwise $X$ will be 
classified as B. 
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Once these boundaries are established between all pairs of signatures, 
classification proceeds as follows. Given a pixel, it is tested against 
the hyperplane between signatures 1 and 2, and one of these two classes wins. 
The winning signature is tested against the third, this winner against the 
fourth, etc. The ultimate winning signature thus will emerge, and the 
exponent value will be computed. If the exponent is less 
than a specified threshold, the point will be tabulated as belonging to the 
winning signature class, but otherwise the point will be tabulated into the 
class "unclassified". 
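
The sequential pairwise test can be sketched as follows. This is not the PEC code itself: the data layout (`planes` keyed by class pairs), the function names, and the one-channel example are assumptions, and the final exponent threshold test is omitted because the text does not give its formula.

```python
def side(x, normal, offset):
    """Signed value X . C - D for a hyperplane with unit normal C and offset D."""
    return sum(xi * ci for xi, ci in zip(x, normal)) - offset


def classify(x, n_classes, planes):
    """Run the winner-stays tournament over classes 0 .. n_classes - 1.

    planes[(a, b)] = (C, D) for a < b; a pixel goes to a when X . C - D < 0
    and to b otherwise.
    """
    winner = 0
    for challenger in range(1, n_classes):
        normal, offset = planes[(winner, challenger)]
        if side(x, normal, offset) >= 0:
            winner = challenger
    return winner


# Hypothetical one-channel example: class means at 0, 2 and 4, with each
# boundary placed midway between the two means it separates.
PLANES = {(0, 1): ([1.0], 1.0), (0, 2): ([1.0], 2.0), (1, 2): ([1.0], 3.0)}
print([classify([v], 3, PLANES) for v in (0.3, 2.2, 3.5)])
```

Because the current winner always has a lower index than the next challenger, only the boundaries for pairs with a < b are needed.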

A Monte-Carlo technique is employed to generate the pixel from a given 
scene class. The production of a random pixel is as follows. We want 
y such that {y} is normally distributed with signature mean b and covariance 
R. First x is produced with each element normally distributed, so that it has 
mean 0 and covariance I (the identity, that is, channels are uncorrelated). Then 
we want a transformation 

y = Px + b (XI-1) 

which we will apply to every x to get the corresponding y. By definition, the 
covariance R can be written 

$R = E\{(y-b)(y-b)^T\}$ 

where E{ } or E( ) denotes the expected value of the enclosed term. 

Then 

$R = E\{(Px)(Px)^T\} = E(P x x^T P^T)$ 

$= P E(x x^T) P^T = P I P^T = P P^T$ 

A lower-triangular P satisfying this relation is, by definition, the Cholesky 
factor of R. After computing P, each y is obtained quickly from Eq. XI-1. 
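
The random-pixel generation can be illustrated with a short standard-library sketch. It is written in Python rather than the original MAD, the function names are hypothetical, and the covariance and mean values are made up for the example.

```python
import math
import random


def cholesky(r):
    """Lower-triangular P with P P^T = r, for a symmetric positive-definite r."""
    n = len(r)
    p = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(p[i][k] * p[j][k] for k in range(j))
            if i == j:
                p[i][j] = math.sqrt(r[i][i] - s)
            else:
                p[i][j] = (r[i][j] - s) / p[j][j]
    return p


def random_pixel(p, b, rng):
    """Draw y = P x + b (Eq. XI-1) with x having mean 0 and covariance I."""
    x = [rng.gauss(0.0, 1.0) for _ in b]
    return [sum(p[i][k] * x[k] for k in range(i + 1)) + b[i]
            for i in range(len(b))]


R = [[4.0, 2.0], [2.0, 3.0]]   # illustrative signature covariance
B = [10.0, 20.0]               # illustrative signature mean
P = cholesky(R)
rng = random.Random(0)
pixels = [random_pixel(P, B, rng) for _ in range(20000)]
```

Averaging the outer products of (y - b) over the generated pixels should reproduce R to within sampling error, which is a convenient check on P.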
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